Abstract. Consider a random sum $\eta_1 v_1 + \dots + \eta_n v_n$, where $\eta_1, \dots, \eta_n$ are
i.i.d. random signs and $v_1, \dots, v_n$ are integers. The Littlewood-Offord problem
asks to maximize concentration probabilities such as $\mathbf{P}(\eta_1 v_1 + \dots + \eta_n v_n = 0)$
subject to various hypotheses on the $v_1, \dots, v_n$. In this paper we develop an
inverse Littlewood-Offord theory (somewhat in the spirit of Freiman's inverse
theory in additive combinatorics), which starts with the hypothesis that a
concentration probability is large, and concludes that almost all of the $v_1, \dots, v_n$
are efficiently contained in a generalized arithmetic progression. As an
application we give a new bound on the magnitude of the least singular value of a
random Bernoulli matrix, which in turn provides upper tail estimates on the
condition number.



1. Introduction 

Let $\mathbf{v}$ be a multiset (allowing repetitions) of $n$ integers $v_1, \dots, v_n$. Consider a class
of discrete random walks $Y_{\mu,\mathbf{v}}$ on the integers $\mathbf{Z}$, which start at the origin and
consist of $n$ steps, where at the $i$-th step one moves backwards or forwards with
magnitude $v_i$ and probability $\mu/2$, and stays at rest with probability $1 - \mu$. More
precisely:

Definition 1.1 (Random walks). For any $0 \le \mu \le 1$, let $\eta^{(\mu)} \in \{-1, 0, 1\}$ denote
a random variable which equals $0$ with probability $1 - \mu$ and $\pm 1$ with probability
$\mu/2$ each. In particular, $\eta^{(1)}$ is a random sign $\pm 1$, while $\eta^{(0)}$ is identically zero.

Given $\mathbf{v}$, we define $Y_{\mu,\mathbf{v}}$ to be the random variable

$$ Y_{\mu,\mathbf{v}} := \sum_{i=1}^{n} \eta_i^{(\mu)} v_i, $$

where the $\eta_i^{(\mu)}$ are i.i.d. copies of $\eta^{(\mu)}$. Note that the exact enumeration $v_1, \dots, v_n$ of
the multiset is irrelevant.

The concentration probability $\mathbf{P}_\mu(\mathbf{v})$ of this random walk is defined to be the
quantity

$$ \mathbf{P}_\mu(\mathbf{v}) := \max_{a \in \mathbf{Z}} \mathbf{P}(Y_{\mu,\mathbf{v}} = a). \tag{1} $$

Thus we have $0 < \mathbf{P}_\mu(\mathbf{v}) \le 1$ for any $\mu$, $\mathbf{v}$.
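For small multisets the quantity in (1) can be computed exactly by convolving the step distributions one at a time. The following is only an illustrative sketch: the function names `walk_distribution` and `concentration` are our own labels and do not appear in the paper, and exact rational arithmetic is used so that no rounding enters.

```python
from collections import defaultdict
from fractions import Fraction


def walk_distribution(v, mu):
    """Exact law of Y = sum_i eta_i * v_i, where each eta_i is 0 with
    probability 1 - mu and +1 or -1 with probability mu / 2 each."""
    mu = Fraction(mu)
    dist = {0: Fraction(1)}  # the walk starts at the origin
    for step in v:
        nxt = defaultdict(Fraction)
        for pos, p in dist.items():
            nxt[pos] += p * (1 - mu)         # stay at rest
            nxt[pos + step] += p * mu / 2    # move forwards
            nxt[pos - step] += p * mu / 2    # move backwards
        dist = {pos: p for pos, p in nxt.items() if p != 0}
    return dist


def concentration(v, mu=1):
    """Concentration probability: the largest point mass of the walk."""
    return max(walk_distribution(v, mu).values())
```

For instance, `concentration([1, 2, 3], 1)` evaluates the maximum over the eight sign patterns of $\pm 1 \pm 2 \pm 3$; the cost grows with the number of reachable sums, so this is meant for experimentation rather than for large $n$.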

The concentration probability (and more generally, the concentration function) is
a central notion in probability theory and has been studied extensively, especially
by the Russian school (see [21, 19, 18] and the references therein).

The first goal of this paper is to establish a relation between the magnitude of
$\mathbf{P}_\mu(\mathbf{v})$ and the arithmetic structure of the multiset $\mathbf{v} = \{v_1, \dots, v_n\}$. This gives an
answer to the general question of finding conditions under which one can squeeze
large probability inside a small interval. We will primarily be interested in the case
$\mu = 1$, but for technical reasons it will be convenient to consider more general values
of $\mu$. Generally, however, we think of $\mu$ as fixed, while letting $n$ become very large.

A classical result of Littlewood-Offord [16], found in their study of the number of
real roots of random polynomials, asserts that if all of the $v_i$'s are non-zero, then
$\mathbf{P}_1(\mathbf{v}) = O(n^{-1/2} \log n)$. The log term was later removed by Erdős [5]. Erdős'
bound is sharp, as shown by the case $v_1 = \dots = v_n \ne 0$. However, if one forbids
this special case and assumes that the $v_i$'s are all distinct, then the bound can
be improved significantly. Erdős and Moser [6] showed that under this stronger
assumption, $\mathbf{P}_1(\mathbf{v}) = O(n^{-3/2} \ln n)$. They conjectured that the logarithmic term is
not necessary and this was confirmed by Sárközy and Szemerédi [22]. Again, the
bound is sharp (up to a constant factor), as can be seen by taking $v_1, \dots, v_n$ to be a
proper arithmetic progression such as $1, \dots, n$. Later, Stanley [24], using algebraic
methods, gave a very explicit bound for the probability in question.
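The two extremal examples just mentioned can be explored numerically. The sketch below is not taken from the paper: the helper names `sign_sum_counts` and `p1` are hypothetical, and it simply counts sign patterns to obtain $\mathbf{P}_1(\mathbf{v})$ for $v_1 = \dots = v_n = 1$ and for $\mathbf{v} = (1, \dots, n)$, rescaled by $n^{1/2}$ and $n^{3/2}$ respectively.

```python
from collections import Counter


def sign_sum_counts(v):
    """Number of sign patterns (+1/-1 per step) producing each value of the sum."""
    counts = Counter({0: 1})
    for step in v:
        nxt = Counter()
        for s, c in counts.items():
            nxt[s + step] += c
            nxt[s - step] += c
        counts = nxt
    return counts


def p1(v):
    """Concentration probability for random signs (the case mu = 1)."""
    return max(sign_sum_counts(v).values()) / 2 ** len(v)


for n in (8, 16, 24):
    equal_steps = p1([1] * n)                    # all v_i equal and non-zero
    distinct_steps = p1(list(range(1, n + 1)))   # arithmetic progression 1, ..., n
    # Both rescaled quantities are expected to stay bounded as n grows.
    print(n, equal_steps * n ** 0.5, distinct_steps * n ** 1.5)
```

The printed columns are only meant to make the $n^{-1/2}$ and $n^{-3/2}$ scalings visible for a few small values of $n$; they prove nothing about the asymptotic constants.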

The higher dimensional version of Littlewood-Offord's problem (where the $v_i$ are
non-zero vectors in $\mathbf{R}^d$, for some fixed $d$) also drew lots of attention. Without
the assumption that the $v_i$'s are different, the best result was obtained by Frankl
and Füredi in [7], following earlier results by Katona [11], Kleitman [12], Griggs,
Lagarias, Odlyzko and Shearer [8] and many others. However, the techniques used
in these papers did not seem to yield the generalization of Sárközy and Szemerédi's
result (the $O(n^{-3/2})$ bound under the assumption that the vectors are different).

The generalization of Sárközy and Szemerédi's result was obtained by Halász [9],
using analytical methods (especially harmonic analysis). Halász' paper was one of
our starting points in this study.

In the above two examples, we see that in order to make $\mathbf{P}_\mu(\mathbf{v})$ large, we have to
impose a very strong additive structure on $\mathbf{v}$ (in one case we set the $v_i$'s to be the
same, while in the other we set them to be elements of an arithmetic progression).
We are going to show that this is the only way to make $\mathbf{P}_\mu(\mathbf{v})$ large. More precisely,
we propose the following phenomenon:

If $\mathbf{P}_\mu(\mathbf{v})$ is large, then $\mathbf{v}$ has a strong additive structure.

In the next section, we are going to present several theorems supporting this
phenomenon. Let us mention here that there is an analogous phenomenon in
combinatorial number theory. In particular, a famous theorem of Freiman asserts that if
$A$ is a finite set of integers and $A + A$ is small, then $A$ is contained efficiently in a
generalized arithmetic progression [28, Chapter 5]. However, the proofs of Freiman's
theorem and those in this paper are quite different.

As an application, we are going to use these inverse theorems to study random
matrices. Let $M_n^\mu$ be an $n$ by $n$ random matrix, whose entries are i.i.d. copies of $\eta^{(\mu)}$.
We are going to show that with very high probability, the condition number of $M_n^\mu$
is bounded from above by a polynomial in $n$ (see Theorem 3.3 below). This result
has high potential of applications in the theory of probability in Banach spaces, as
well as in numerical analysis and theoretical computer science. A related result was
recently established by Rudelson [20], with better upper bounds on the condition
number but worse probabilities. We will discuss this application in more detail
in Section 3.

To see the connection between this problem and inverse Littlewood-Offord theory,
observe that for any $\mathbf{v} = (v_1, \dots, v_n)$ (which we interpret as a column vector),
the entries of the product $M_n^\mu \mathbf{v}$ are independent copies of $Y_{\mu,\mathbf{v}}$. Thus we expect
that $\mathbf{v}$ is unlikely to lie in the kernel of $M_n^\mu$ unless the concentration probability
$\mathbf{P}_\mu(\mathbf{v})$ is large. These ideas are already enough to control the singularity probability
of $M_n^\mu$ (see e.g. [10, 25, 26]). To obtain the more quantitative condition number
estimates, we introduce a new discretization technique that allows one to estimate
the probability that a certain random variable is small by the probability that a
certain discretized analogue of that variable is zero.

The rest of the paper is organized as follows. In Section 2 we state our main
inverse theorems, and in Section 3 we state our main results on condition numbers,
as well as the key lemmas used to prove these results. In Section 4, we give some
brief applications of the inverse theorems. In Section 7 we prove the result on
condition numbers, assuming the inverse theorems and two other key ingredients:
a discretization of generalized progressions and an extension of the famous result
of Kahn, Komlós and Szemerédi [10] on the probability that a random Bernoulli
matrix is singular. The inverse theorems are proven in Section 6, after some
preliminaries in Section 5 in which we establish basic properties of $\mathbf{P}_\mu(\mathbf{v})$. The result
about discretization of progressions is proven in Section 8. Finally in Section 9
we prove the extension of Kahn, Komlós and Szemerédi [10].

Let us conclude this section by setting out some basic notation. A set

$$ P = \{ c + m_1 a_1 + \dots + m_d a_d \mid M_i \le m_i \le M_i' \} $$

is called a generalized arithmetic progression (GAP) of rank $d$. It is convenient to
think of $P$ as the image of an integer box $B := \{ (m_1, \dots, m_d) \mid M_i \le m_i \le M_i' \}$ in
$\mathbf{Z}^d$ under the linear map

$$ \Phi : (m_1, \dots, m_d) \mapsto c + m_1 a_1 + \dots + m_d a_d. $$

The numbers $a_i$ are the generators of $P$. In this paper, all GAPs have rational
generators. A GAP is proper if $\Phi$ is one to one on $B$. The product $\prod_{i=1}^{d} (M_i' - M_i + 1)$
is the volume of $P$. If $M_i = -M_i'$ and $c = 0$ (so $P = -P$) then we say that $P$ is
symmetric.

For a set $A$ of reals and a positive integer $k$, we define the iterated sumset

$$ kA := \{ a_1 + \dots + a_k \mid a_i \in A \}. $$

One should take care to distinguish the sumset $kA$ from the dilate $k \cdot A$, defined for
any real $k$ as

$$ k \cdot A := \{ ka \mid a \in A \}. $$

We always assume that $n$ is sufficiently large. The asymptotic notation $O()$, $o()$,
$\Omega()$, $\Theta()$ is used under the assumption that $n \to \infty$. Notation such as $O_d(f)$ means
that the hidden constant in $O$ depends only on $d$.
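The notions of volume, properness, iterated sumset and dilate can be made concrete on small examples. The following sketch is purely illustrative; the function names (`gap_elements`, `gap_volume`, `is_proper`, `iterated_sumset`, `dilate`) are our own and are not used elsewhere in the paper.

```python
from itertools import product
from math import prod


def gap_elements(c, gens, lo, hi):
    """Image of the box {lo[i] <= m_i <= hi[i]} under m -> c + sum_i m_i * gens[i],
    listed with multiplicity (one entry per point of the box)."""
    ranges = [range(l, h + 1) for l, h in zip(lo, hi)]
    return [c + sum(m * a for m, a in zip(ms, gens)) for ms in product(*ranges)]


def gap_volume(lo, hi):
    """Volume: the number of points in the box, not the number of distinct values."""
    return prod(h - l + 1 for l, h in zip(lo, hi))


def is_proper(c, gens, lo, hi):
    """A GAP is proper when the map from the box to the integers is one to one."""
    elems = gap_elements(c, gens, lo, hi)
    return len(set(elems)) == len(elems)


def iterated_sumset(A, k):
    """kA: all sums of k (not necessarily distinct) elements of A."""
    return {sum(combo) for combo in product(A, repeat=k)}


def dilate(A, k):
    """k . A: every element of A multiplied by k."""
    return {k * a for a in A}
```

With generators $(1, 10)$ and box $[-2, 2]^2$ the progression is proper with volume 25, whereas generators $(1, 2)$ on the same box give only 13 distinct values; similarly `iterated_sumset({0, 1}, 3)` is `{0, 1, 2, 3}` while `dilate({0, 1}, 3)` is `{0, 3}`.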

We thank the referee for many detailed comments and corrections. 

2. Inverse Littlewood-Offord theorems 

Let us start by presenting an example when $\mathbf{P}_\mu(\mathbf{v})$ is large. This example is the
motivation of our inverse theorems.

Example 2.1. Let $P$ be a symmetric generalized arithmetic progression of rank $d$
and volume $V$; we view $d$ as being fixed independently of $n$, though $V$ can grow with
$n$. Let $v_1, \dots, v_n$ be (not necessarily different) elements of $P$. Then the random
variable $Y_{\mu,\mathbf{v}} = \sum_{i=1}^{n} \eta_i^{(\mu)} v_i$ takes values in the GAP $nP$, which has volume at most $n^d V$.

From the pigeonhole principle it follows that

$$ \mathbf{P}_\mu(\mathbf{v}) = \Omega\left( \frac{1}{n^d V} \right). $$

In fact, the central limit theorem suggests that $\mathbf{P}_\mu(\mathbf{v})$ should typically be of the
order of $n^{-d/2} V^{-1}$.
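The pigeonhole bound of Example 2.1 is easy to test on a toy instance. The sketch below makes several arbitrary choices that are not in the paper: a rank-2 symmetric GAP with generators 1 and 100, a fixed deterministic selection of $n = 12$ elements from it, and the case $\mu = 1$. The helper `p1` is a hypothetical name for the concentration probability with random signs.

```python
from collections import Counter
from itertools import product


def p1(v):
    """Concentration probability for random signs, by counting sign patterns."""
    counts = Counter({0: 1})
    for step in v:
        nxt = Counter()
        for s, c in counts.items():
            nxt[s + step] += c
            nxt[s - step] += c
        counts = nxt
    return max(counts.values()) / 2 ** len(v)


# Symmetric GAP P = {m1 * 1 + m2 * 100 : |m1|, |m2| <= 2}; it is proper, so V = 25.
gens, M = (1, 100), 2
P = [m1 * gens[0] + m2 * gens[1] for m1, m2 in product(range(-M, M + 1), repeat=2)]
d, V = len(gens), len(set(P))

n = 12
v = [P[(7 * i) % len(P)] for i in range(n)]  # an arbitrary choice of n elements of P

pigeonhole_bound = 1 / (n ** d * V)          # lower bound from Example 2.1
clt_heuristic = 1 / (n ** (d / 2) * V)       # order suggested by the central limit theorem
print(p1(v), pigeonhole_bound, clt_heuristic)
```

Only the inequality `p1(v) >= pigeonhole_bound` is guaranteed; the central limit heuristic is an order of magnitude, so the middle printed value should be read against it loosely.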

This example shows that if the elements of $\mathbf{v}$ belong to a GAP with small rank
and small volume then $\mathbf{P}_\mu(\mathbf{v})$ is large. One might hope that the inverse also holds,
namely,

If $\mathbf{P}_\mu(\mathbf{v})$ is large, then (most of) the elements of $\mathbf{v}$ belong to a GAP with small
rank and small volume.

In the rest of this section, we present three theorems, which support this statement
in a quantitative way.

Definition 2.2 (Dissociativity). Given a multiset $\mathbf{w} = \{w_1, \dots, w_r\}$ of real
numbers and a positive number $k$, we define the GAP $Q(\mathbf{w}, k)$ and the cube $S(\mathbf{w})$ as
follows:

$$ Q(\mathbf{w}, k) := \{ m_1 w_1 + \dots + m_r w_r \mid -k \le m_i \le k \}, $$

$$ S(\mathbf{w}) := \{ \epsilon_1 w_1 + \dots + \epsilon_r w_r \mid \epsilon_i \in \{-1, 1\} \}. $$

We say that $\mathbf{w}$ is dissociated if $S(\mathbf{w})$ does not contain zero. Furthermore, $\mathbf{w}$ is
$k$-dissociated if there do not exist integers $-k \le m_1, \dots, m_r \le k$, not all zero, such
that $m_1 w_1 + \dots + m_r w_r = 0$.

Our first result is the following simple proposition:

Proposition 2.3 (Zeroth inverse theorem). Let $\mathbf{v} = \{v_1, \dots, v_n\}$ be such that
$\mathbf{P}_1(\mathbf{v}) > 2^{-d-1}$ for some integer $d \ge 0$. Then $\mathbf{v}$ contains a subset $\mathbf{w}$ of size $d$ such
that the cube $S(\mathbf{w})$ contains $v_1, \dots, v_n$.
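The two notions in Definition 2.2 can be checked by brute force on small multisets. This is a minimal sketch under the definitions exactly as stated above (signs in $\{-1, 1\}$ for the cube, coefficients in $[-k, k]$ for $k$-dissociativity); the function names are hypothetical and the enumeration is exponential in the size of $\mathbf{w}$.

```python
from itertools import product


def cube(w):
    """S(w): all signed sums e_1 w_1 + ... + e_r w_r with each e_i in {-1, 1}."""
    return {sum(e * x for e, x in zip(eps, w))
            for eps in product((-1, 1), repeat=len(w))}


def is_dissociated(w):
    """w is dissociated when the cube S(w) does not contain zero."""
    return 0 not in cube(w)


def is_k_dissociated(w, k):
    """w is k-dissociated when no non-trivial integer combination with
    coefficients in [-k, k] vanishes."""
    for ms in product(range(-k, k + 1), repeat=len(w)):
        if any(ms) and sum(m * x for m, x in zip(ms, w)) == 0:
            return False
    return True
```

For example, `[1, 2, 4]` is dissociated (every signed sum is odd), `[1, 2, 3]` is not, and `[1, 10]` is 2-dissociated but not 10-dissociated because $10 \cdot 1 - 1 \cdot 10 = 0$.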

The next two theorems are more involved and also more useful. In these two
theorems and their corollaries, we assume that $k$ and $n$ are sufficiently large, whenever
needed.

Theorem 2.4 (First inverse theorem). Let $\mu$ be a positive constant at most 1 and
let $d$ be a positive integer. Then there is a constant $C = C(\mu, d) \ge 1$ such that the
following holds. Let $k \ge 2$ be an integer and let $\mathbf{v} = \{v_1, \dots, v_n\}$ be a multiset such
that

$$ \mathbf{P}_\mu(\mathbf{v}) \ge C(\mu, d) k^{-d}. $$

Then there exists a $k$-dissociated multiset $\mathbf{w} = \{w_1, \dots, w_r\}$ such that

(1) $r \le d - 1$ and $w_1, \dots, w_r$ are elements of $\mathbf{v}$;

(2) The union $\bigcup_{\tau \in \mathbf{Z},\, 1 \le \tau \le k} \frac{1}{\tau} \cdot Q(\mathbf{w}, k)$ contains all but $k^2$ of the integers $v_1, \dots, v_n$
(counting multiplicity).

This theorem should be compared against the heuristics in Example 2.1 (setting $k$
equal to a small multiple of $\sqrt{n}$). In particular, notice that the GAP $Q(\mathbf{w}, k)$ has
very small volume, only $O(k^{d-1})$.

The above theorem does not yet show that most of the elements of $\mathbf{v}$ belong to a
single GAP. Instead, it shows that they belong to the union of a few dilates of a
GAP. One could remove the unwanted $\frac{1}{\tau}$ factor by clearing denominators, but this
costs us an exponential factor such as $k!$, which is often too large in applications.
Fortunately, a more refined argument allows us to eliminate these denominators
while losing only polynomial factors in $k$:

Theorem 2.5 (Second inverse theorem). Let $\mu$ be a positive constant at most one,
$\epsilon$ be an arbitrary positive constant and $d$ be a positive integer. Then there are
constants $C = C(\mu, \epsilon, d) \ge 1$ and $k_0 = k_0(\mu, \epsilon, d) \ge 1$ such that the following holds.
Let $k \ge k_0$ be an integer and let $\mathbf{v} = \{v_1, \dots, v_n\}$ be a multiset such that

$$ \mathbf{P}_\mu(\mathbf{v}) \ge C k^{-d}. $$

Then there exists a GAP $Q$ with the following properties:

(1) The rank of $Q$ is at most $d - 1$;

(2) The volume of $Q$ is at most $k^{2(d^2 - 1) + \epsilon}$;

(3) $Q$ contains all but at most $\epsilon k^2 \log k$ elements of $\mathbf{v}$ (counting multiplicity);

(4) There exists a positive integer $s$ at most $k^{d + \epsilon}$ such that $su \in \mathbf{v}$ for each
generator $u$ of $Q$.

Remark 2.6. A small number of exceptional elements cannot be avoided. For
instance, one can add $O(\log k)$ completely arbitrary elements to $\mathbf{v}$, and decrease
$\mathbf{P}_\mu(\mathbf{v})$ by a factor of $k^{-O(1)}$ at worst.

For the applications in this paper, the following corollary of Theorem 2.5 is
convenient.

Corollary 2.7. For any positive constants $A$ and $\alpha$ there is a positive constant $A'$
such that the following holds. Let $\mu$ be a positive constant at most one and assume
that $\mathbf{v} = \{v_1, \dots, v_n\}$ is a multiset of integers satisfying $\mathbf{P}_\mu(\mathbf{v}) \ge n^{-A}$. Then there
is a GAP $Q$ of rank at most $A'$ and volume at most $n^{A'}$ which contains all but at
most $n^{\alpha}$ elements of $\mathbf{v}$ (counting multiplicity). Furthermore, there exists a positive
integer $s \le n^{A'}$ such that $su \in \mathbf{v}$ for each generator $u$ of $Q$.

Remark 2.8. The assumption $\mathbf{P}_\mu(\mathbf{v}) \ge n^{-A}$ in all statements can be replaced by
the following more technical, but somewhat weaker assumption, that

$$ \int_0^1 \prod_{i=1}^{n} \left| (1 - \mu) + \mu \cos 2\pi v_i \xi \right| \, d\xi \ge n^{-A}. $$

The left-hand side is an upper bound for $\mathbf{P}_\mu(\mathbf{v})$, provided that $\mu$ is sufficiently
small. Assuming that $\mathbf{P}_\mu(\mathbf{v}) \ge n^{-A}$, what we will really use in the proofs is the
consequence

$$ \int_0^1 \prod_{i=1}^{n} \left| (1 - \mu) + \mu \cos 2\pi v_i \xi \right| \, d\xi \ge n^{-A}. $$

(See Section 5 for more details.) This weaker assumption is useful in applications
(see [27]).

The vector versions of all three theorems (when the $v_i$'s are vectors in $\mathbf{R}^r$, for any
positive integer $r$) hold, thanks to Freiman's isomorphism principle (see, e.g., [28,
Chapter 5]). This principle allows us to project the problem from $\mathbf{R}^r$ onto $\mathbf{Z}$. The
value of $r$ is irrelevant and does not appear in any quantitative bound. In fact, one
can even replace $\mathbf{R}^r$ by any torsion-free additive group.

Finally, let us mention that in an earlier paper [26] we introduced another type of
inverse Littlewood-Offord theorem. This result showed that if $\mathbf{P}_\mu(\mathbf{v})$ was
comparable to $\mathbf{P}_1(\mathbf{v})$, then $\mathbf{v}$ could be efficiently contained inside a GAP of bounded rank
(see [26, Theorem 5.2] for details).

We shall prove these inverse theorems in Section 6, after some combinatorial and 
Fourier-analytic preliminaries in Section 5. For now, let us take these results for 
granted and turn to an application of these inverse theorems to random matrices.

3. The condition number of random matrices

If $M$ is an $n \times n$ matrix, we use

$$ \sigma_1(M) := \sup_{x \in \mathbf{R}^n,\, \|x\| = 1} \|Mx\| $$

to denote the largest singular value of $M$ (this parameter is also often called the
operator norm of $M$). Here of course $\|x\|$ denotes the Euclidean magnitude of a
vector $x \in \mathbf{R}^n$. If $M$ is invertible, the condition number $c(M)$ is defined as

$$ c(M) := \sigma_1(M) \sigma_1(M^{-1}). $$

We adopt the convention that $c(M)$ is infinite if $M$ is not invertible.

The condition number plays a crucial role in applied linear algebra and computer 
science. In particular, the complexity of any algorithm which requires solving a 
system of linear equations usually involves the condition number of a matrix [1, 23]. 
Another area of mathematics where this parameter is important is the theory of 
probability in Banach spaces (see [15, 20], for instance). 

The condition number of a random matrix is a well-studied object (see [3] and the
references therein). In the case when the entries of $M$ are i.i.d. Gaussian random
variables (with mean zero and variance one), Edelman [3], answering a question of
Smale [23], showed

Theorem 3.1. Let $N_n$ be an $n \times n$ random matrix, whose entries are i.i.d. Gaussian
random variables (with mean zero and variance one). Then $\mathbf{E}(\ln c(N_n)) = \ln n +
c + o(1)$, where $c > 0$ is an explicit constant.

In applications, it is usually useful to have a tail estimate. It was shown by Edelman
and Sutton [4] that

Theorem 3.2. Let $N_n$ be an $n$ by $n$ random matrix, whose entries are i.i.d. Gaussian
random variables (with mean zero and variance one). Then for any constant $A > 0$,

$$ \mathbf{P}(c(N_n) \ge n^{1+A}) = O_A(n^{-A}). $$

On the other hand, for the other basic case when the entries are i.i.d. Bernoulli
random variables (copies of $\eta^{(1)}$), the situation is far from being settled. Even to
prove that the condition number is finite with high probability is a non-trivial task
(see [13]). The techniques used to study Gaussian matrices rely heavily on the
explicit joint distribution of the eigenvalues. This distribution is not available for
discrete models.

Using our inverse theorems, we can prove the following result, which is comparable
to Theorem 3.2, and is another main result of this paper. Let $M_n^\mu$ be the $n$ by $n$
random matrix whose entries are i.i.d. copies of $\eta^{(\mu)}$. In particular, the Bernoulli
matrix mentioned above is the case when $\mu = 1$.

Theorem 3.3. For any positive constant $A$, there is a positive constant $B$ such that
the following holds. For any positive constant $\mu$ at most one and any sufficiently
large $n$,

$$ \mathbf{P}(c(M_n^\mu) \ge n^B) \le n^{-A}. $$

TERENCE TAO AND VAN H. VU

Given an invertible matrix $M$ of order $n$, we set $\sigma_n(M)$ to be the smallest singular
value of $M$:

$$ \sigma_n(M) := \min_{x \in \mathbf{R}^n,\, \|x\| = 1} \|Mx\|. $$

Then we have

$$ c(M) = \sigma_1(M) / \sigma_n(M). $$

It is well known that there is a constant $C_\mu$ such that the largest singular value
of $M_n^\mu$ is at most $C_\mu n^{1/2}$ with exponential probability $1 - \exp(-\Omega_\mu(n))$ (see, for
instance [14]). Thus, Theorem 3.3 reduces to the following lower tail estimate for
the smallest singular value $\sigma_n(M_n^\mu)$:

Theorem 3.4. For any positive constant $A$, there is a positive constant $B$ such that
the following holds. For any positive constant $\mu$ at most one and any sufficiently
large $n$,

$$ \mathbf{P}(\sigma_n(M_n^\mu) \le n^{-B}) \le n^{-A}. $$

Shortly prior to this paper, Rudelson [20] proved the following result.

Theorem 3.5. Let $0 < \mu \le 1$. There are positive constants $c_1(\mu)$, $c_2(\mu)$ such that
the following holds. For any $\epsilon \ge c_1(\mu) n^{-1/2}$,

$$ \mathbf{P}(\sigma_n(M_n^\mu) \le c_2(\mu) \epsilon n^{-3/2}) \le \epsilon. $$

In fact, Rudelson's result holds for a larger class of matrices. The description of
this class is, however, somewhat technical so we refer the reader to [20] for details.

It is useful to compare Theorems 3.4 and 3.5. Theorem 3.5 gives an explicit
dependence between the bound on $\sigma_n$ and the probability, while the dependence
between $A$ and $B$ in Theorem 3.4 is implicit. Actually our proof does provide an
explicit value for $B$, but it is rather large and we make no attempt to optimize it.
On the other hand, Theorem 3.5 does not yield a probability better than $n^{-1/2}$. In
many applications (especially those involving the union bound), it is important to
have a probability bound of order $n^{-A}$ with arbitrarily given $A$.

The proof of Theorem 3.4 relies on Corollary 2.7 and two other ingredients, which 

are of independent interest. In the rest of this section, we discuss these ingredients. 
These ingredients will then be combined in Section 7 to prove Theorem 3.4. 

3.6. Discretization of GAPs. Let $P$ be a GAP of integers of rank $d$ and volume
$V$. We show that given any specified scale parameter $R_0$, one can "discretize"
$P$ near the scale $R_0$. More precisely, one can cover $P$ by the sum of a coarse
progression and a small progression, where the diameter of the small progression is
much smaller (by an arbitrarily specified factor of $S$) than the spacing of the coarse
progression, and both of these quantities are close to $R_0$ (up to a bounded
power of $SV$).

Theorem 3.7 (Discretization). Let $P \subset \mathbf{Z}$ be a symmetric GAP of rank $d$ and
volume $V$. Let $R_0, S$ be positive integers. Then there exists a scale $R \ge 1$ and two
GAPs $P_{\mathrm{small}}$, $P_{\mathrm{sparse}}$ of rational numbers with the following properties.

• (Scale) $R = (SV)^{O_d(1)} R_0$.

• (Smallness) $P_{\mathrm{small}}$ has rank at most $d$, volume at most $V$, and takes values
in $[-R/S, R/S]$.

• (Sparseness) $P_{\mathrm{sparse}}$ has rank at most $d$, volume at most $V$, and any two
distinct elements of $S P_{\mathrm{sparse}}$ are separated by at least $RS$.

• (Covering) $P \subseteq P_{\mathrm{small}} + P_{\mathrm{sparse}}$.

This theorem is elementary but is somewhat involved and the detailed proof will
appear in Section 8. Let us, at this point, give an informal explanation, appealing
to the analogy between the combinatorics of progressions and linear algebra. Recall
that a GAP of rank $d$ is the image $\Phi(B)$ of a $d$-dimensional box under a linear map
$\Phi$. This can be viewed as a discretized, localized analogue of the object $\Phi(V)$,
where $\Phi$ is a linear map from a $d$-dimensional vector space $V$ to some other vector
space. The analogue of a "small" progression would be an object $\Phi(V)$ in which $\Phi$
vanished. The analogue of a "sparse" progression would be an object $\Phi(V)$ in which
the map $\Phi$ was injective. Theorem 3.7 is then a discretized, localized analogue of
the obvious linear algebra fact that given any object of the form $\Phi(V)$, one can split
$V = V_{\mathrm{small}} + V_{\mathrm{sparse}}$ for which $\Phi(V_{\mathrm{small}})$ is small and $\Phi(V_{\mathrm{sparse}})$ is sparse. Indeed
one simply sets $V_{\mathrm{small}}$ to be the kernel of $\Phi$, and $V_{\mathrm{sparse}}$ to be any complementary
subspace to $V_{\mathrm{small}}$ in $V$. The proof of Theorem 3.7 that we give follows these broad
ideas, with $P_{\mathrm{small}}$ being essentially a "kernel" of the progression $P$, and $P_{\mathrm{sparse}}$
being a kind of "complementary progression" to this kernel.

To oversimplify enormously, we shall exploit this discretization result (as well as 
the inverse Littlewood-Offord theorems) to control the event that the singular value 
is small, by the event that the singular value (of a slightly modified random matrix) 
is zero. The control of this latter quantity is the other ingredient of the proof, to 
which we now turn. 

3.8. Singularity of random matrices. A famous result of Kahn, Komlós and
Szemerédi [10] asserts that the probability that $M_n^1$ is singular (or equivalently, that
$\sigma_n(M_n^1) = 0$) is exponentially small:

Theorem 3.9. There is a positive constant $\epsilon$ such that

$$ \mathbf{P}(\sigma_n(M_n^1) = 0) \le (1 - \epsilon)^n. $$

In [10] it was shown that one can take $\epsilon = .001$. Improvements on $\epsilon$ were obtained
recently in [25, 26]. The value of $\epsilon$ does not play a critical role in this paper.

To prove Theorem 3.3, we need the following generalization of Theorem 3.9. Notice
that the row vectors of $M_n^1$ are i.i.d. copies of $X^1$, where $X^1 = (\eta_1^{(1)}, \dots, \eta_n^{(1)})$ and the $\eta_i^{(1)}$
are i.i.d. copies of $\eta^{(1)}$. By changing 1 to $\mu$, we can define $X^\mu$ in the obvious manner.
Now let $Y$ be a set of $l$ vectors $y_1, \dots, y_l$ in $\mathbf{R}^n$ and $M_n^{\mu,Y}$ be the random matrix
whose rows are $X_1^\mu, \dots, X_{n-l}^\mu, y_1, \dots, y_l$, where the $X_i^\mu$ are i.i.d. copies of $X^\mu$.

Theorem 3.10. Let $0 < \mu \le 1$, and let $l$ be a non-negative integer. Then there
is a positive constant $\epsilon = \epsilon(\mu, l)$ such that the following holds. For any set $Y$ of $l$
independent vectors from $\mathbf{R}^n$,

$$ \mathbf{P}(\sigma_n(M_n^{\mu,Y}) = 0) \le (1 - \epsilon)^n. $$

Corollary 3.11. Let $0 < \mu \le 1$. Then there is a positive constant $\epsilon = \epsilon(\mu)$ such
that the following holds. For any vector $y \in \mathbf{R}^n$, the probability that there are
$w_1, \dots, w_{n-1}$, not all zeros, such that

$$ y = X_1^\mu w_1 + \dots + X_{n-1}^\mu w_{n-1} $$

is at most $(1 - \epsilon)^n$.

We will prove Theorem 3.10 in Section 9 by using the machinery from [25].

4. Some quick applications of the inverse theorems

The inverse theorems provide effective bounds for counting the number of
"exceptional" collections $\mathbf{v}$ of numbers with high concentration probability; see for
instance [26] for a demonstration of how such bounds can be used in applications.
In this section, we present two such bounds that can be obtained from the inverse
theorems developed here. In the first example, let $\epsilon$ be a positive constant and $M$
be a large integer and consider the following question:

How many sets $\mathbf{v}$ of $n$ integers with absolute values at most $M$ are there such that
$\mathbf{P}_1(\mathbf{v}) \ge \epsilon$?

By Erdős' result, all but at most $O(\epsilon^{-2})$ of the elements of $\mathbf{v}$ are zero. Thus
we have the upper bound $\binom{n}{\epsilon^{-2}} (2M + 1)^{O(\epsilon^{-2})}$ for the number in question. Using
Proposition 2.3, we can obtain a better bound as follows. There are only $M^{O(\ln \epsilon^{-1})}$
ways to choose the generators of the cube. After the cube is fixed, we need to choose
$O(\epsilon^{-2})$ non-zero elements inside it. As the cube has volume $O(\epsilon^{-1})$, the number of
ways to do this is $(1/\epsilon)^{O(\epsilon^{-2})}$. Thus, we end up with a bound

$$ M^{O(\ln \epsilon^{-1})} \left( \frac{1}{\epsilon} \right)^{O(\epsilon^{-2})}, $$

which is better than the previous one if $M$ is considerably larger than $\epsilon^{-1}$.

For the second application, we return to the question of bounding the singularity
probability $\mathbf{P}(\sigma_n(M_n^1) = 0)$ studied in Theorem 3.9. This probability is conjectured
to equal $(1/2 + o(1))^n$, but this remains open (see [26] for the latest results and
some further discussion). The event that $M_n^1$ is singular is the same as the event
that there exists some non-zero vector $v \in \mathbf{R}^n$ such that $M_n^1 v = 0$. For simplicity,
we use the notation $M_n$ instead of $M_n^1$ in the rest of this section. It turns out that
one can obtain the optimal bound $(1/2 + o(1))^n$ if one restricts $v$ to some special
set of vectors.

Let $\Omega_1$ be the set of vectors in $\mathbf{R}^n$ with at least $3n/\log_2 n$ zero coordinates. Komlós
proved the following:

Theorem 4.1. The probability that $M_n v = 0$ for some non-zero $v \in \Omega_1$ is
$(1/2 + o(1))^n$.

A proof of this theorem can be found in Bollobás' book [2].

We are going to consider another restricted class. Let $C$ be an arbitrary positive
constant and let $\Omega_2$ be the set of integer vectors in $\mathbf{R}^n$ where the coordinates have
absolute values at most $n^C$. Using Theorem 2.4, we can prove

Theorem 4.2. The probability that $M_n v = 0$ for some non-zero $v \in \Omega_2$ is
$(1/2 + o(1))^n$.

Proof. The lower bound is trivial so we focus on the upper bound. For each non-zero
vector $v$, let $p(v)$ be the probability that $X \cdot v = 0$, where $X$ is a random Bernoulli
vector. From independence we have $\mathbf{P}(M_n v = 0) = p(v)^n$. Since a hyperplane can
contain at most $2^{n-1}$ vectors from $\{-1, +1\}^n$, $p(v)$ is at most $1/2$. For $j = 1, 2, \dots$,
let $S_j$ be the number of non-zero vectors $v$ in $\Omega_2$ such that $2^{-j-1} < p(v) \le 2^{-j}$.
Then the probability that $M_n v = 0$ for some non-zero $v \in \Omega_2$ is at most

$$ \sum_{j \ge 1} (2^{-j})^n S_j. $$

Let us now restrict the range of $j$. Notice that if $p(v) \ge n^{-1/3}$, then by Erdős's
result (mentioned in the Introduction) most of the coordinates of $v$ are zero. In
this case, by Theorem 4.1 the contribution from these $v$ is at most $(1/2 + o(1))^n$.
Next, since the number of vectors in $\Omega_2$ is at most $(2n^C + 1)^n \le n^{(C+1)n}$, we can
ignore those $j$ where $2^{-j} \le n^{-C-2}$. Now it suffices to show

$$ \sum_{n^{-C-2} \le 2^{-j} \le n^{-1/3}} (2^{-j})^n S_j = o((1/2)^n). $$

For any relevant $j$, we can find an integer $d = O(1)$ and a positive number $\epsilon = O(1)$
such that

$$ n^{-(d - 1/3)\epsilon} \le 2^{-j} < n^{-(d - 2/3)\epsilon}. $$

Set $k := n^{\epsilon}$. Thus $2^{-j} \gg k^{-d}$ and we can use Theorem 2.4 to estimate $S_j$.
Indeed, by invoking this theorem, we see that there are at most $\binom{n}{k^2} (2n^C + 1)^{k^2} =
n^{O(k^2)} = n^{o(n)}$ ways to choose the positions and values of exceptional coordinates
of $v$. Furthermore, there are only $(2n^C + 1)^{d-1} = n^{O(1)}$ ways to fix the generalized
progression $P := Q(\mathbf{w}, k)$.

Notice that the elements of $P$ are polynomially bounded in $n$. Such integers
have only $n^{o(1)}$ divisors. So if $P$ is fixed then any (non-exceptional) coordinate
of $v$ has at most $|P| n^{o(1)}$ possible values. This means that once $P$ is fixed, the
number of ways to set the non-exceptional coordinates of $v$ is at most $(n^{o(1)} |P|)^n =
(2k + 1)^{(d - 1 + o(1))n}$. Putting these together,

$$ S_j \le n^{O(k^2)} k^{(d - 1 + o(1))n}. $$

As $k = n^{\epsilon}$ and $2^{-j} \le n^{-(d - 2/3)\epsilon}$, it follows that

$$ (2^{-j})^n S_j \le n^{o(n)} k^{-(1/3 - o(1))n} = o\left( \frac{1}{\log n} \left( \frac{1}{2} \right)^n \right). $$

Since there are only $O(\log n)$ relevant $j$, we can conclude the proof by summing the
bound over $j$. ■

5. Properties of $\mathbf{P}_\mu(\mathbf{v})$

In order to prove the inverse Littlewood-Offord theorems in Section 2, we shall first
need to develop some useful tools for estimating the quantity $\mathbf{P}_\mu(\mathbf{v})$. That shall be
the purpose of this section. We remark that the tools here are only used for the
proof of the inverse Littlewood-Offord theorems in Section 6 and are not required
elsewhere in the paper.

It is convenient to think of $\mathbf{v}$ as a word, obtained by concatenating the numbers
$v_i$:

$$ \mathbf{v} = v_1 v_2 \dots v_n. $$

This will allow us to perform several operations such as concatenating, truncating
and repeating. For instance, if $\mathbf{v} = v_1 \dots v_n$ and $\mathbf{w} = w_1 \dots w_m$, then

$$ \mathbf{P}_\mu(\mathbf{v}\mathbf{w}) = \max_{a \in \mathbf{Z}} \mathbf{P}\left( \sum_{i=1}^{n} \eta_i^{(\mu)} v_i + \sum_{j=1}^{m} \eta_{n+j}^{(\mu)} w_j = a \right), $$

where the $\eta_k^{(\mu)}$, $1 \le k \le n + m$, are i.i.d. copies of $\eta^{(\mu)}$. Furthermore, we use $\mathbf{v}^k$ to denote
the concatenation of $k$ copies of $\mathbf{v}$.

It turns out that there is a nice calculus concerning the expressions $\mathbf{P}_\mu(\mathbf{v})$,
especially when $\mu$ is small. The core properties are summarized in the next lemma.
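Lemma 5.1. The following properties hold.

• $\mathbf{P}_\mu(\mathbf{v})$ is invariant under permutations of $\mathbf{v}$.

• For any words $\mathbf{v}, \mathbf{w}$,

$$ \mathbf{P}_\mu(\mathbf{v}) \mathbf{P}_\mu(\mathbf{w}) \le \mathbf{P}_\mu(\mathbf{v}\mathbf{w}) \le \mathbf{P}_\mu(\mathbf{v}). \tag{2} $$

• For any $0 < \mu \le 1$, any $0 < \mu' \le \mu/4$, and any word $\mathbf{v}$,

$$ \mathbf{P}_\mu(\mathbf{v}) \le \mathbf{P}_{\mu'}(\mathbf{v}). \tag{3} $$

• For any number $0 < \mu \le 1/2$, any integer $k \ge 1$ and any word $\mathbf{v}$,

$$ \mathbf{P}_\mu(\mathbf{v}) \le \mathbf{P}_{\mu/k}(\mathbf{v}^k). \tag{4} $$

• For any number $0 < \mu \le 1/2$ and any words $\mathbf{v}, \mathbf{w}_1, \dots, \mathbf{w}_m$ we have

$$ \mathbf{P}_\mu(\mathbf{v}\mathbf{w}_1 \dots \mathbf{w}_m) \le \left( \prod_{j=1}^{m} \mathbf{P}_\mu(\mathbf{v}\mathbf{w}_j^m) \right)^{1/m}. \tag{5} $$

• For any number $0 < \mu \le 1/2$ and any words $\mathbf{v}, \mathbf{w}_1, \dots, \mathbf{w}_m$, there is an
index $1 \le j \le m$ such that

$$ \mathbf{P}_\mu(\mathbf{v}\mathbf{w}_1 \dots \mathbf{w}_m) \le \mathbf{P}_\mu(\mathbf{v}\mathbf{w}_j^m). \tag{6} $$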
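Properties (2), (3) and (4) can be spot-checked on small words with exact arithmetic. The following sketch is our own illustration, not part of the paper: the words $\mathbf{v} = 1\,2$ and $\mathbf{w} = 3\,5$, the value $\mu = 1/2$ and the repetition count $k = 3$ are arbitrary choices, and `concentration` is a hypothetical helper name.

```python
from fractions import Fraction


def concentration(v, mu):
    """Exact concentration probability of the walk with steps v and laziness 1 - mu."""
    mu = Fraction(mu)
    dist = {0: Fraction(1)}
    for step in v:
        nxt = {}
        for pos, p in dist.items():
            for shift, q in ((0, 1 - mu), (step, mu / 2), (-step, mu / 2)):
                nxt[pos + shift] = nxt.get(pos + shift, Fraction(0)) + p * q
        dist = nxt
    return max(dist.values())


v, w, mu, k = [1, 2], [3, 5], Fraction(1, 2), 3

checks = {
    # (2): concatenating words can only lower the concentration, but not below the product
    "(2) lower": concentration(v, mu) * concentration(w, mu) <= concentration(v + w, mu),
    "(2) upper": concentration(v + w, mu) <= concentration(v, mu),
    # (3): shrinking mu by a factor of at least 4 does not decrease the concentration
    "(3)": concentration(v, mu) <= concentration(v, mu / 4),
    # (4): the list v * k is the word v repeated k times, i.e. v^k
    "(4)": concentration(v, mu) <= concentration(v * k, mu / k),
}
print(checks)
```

A finite check of this kind is of course no substitute for the proof; it is only a convenient way to catch a misread inequality direction.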



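Proof. The first two properties are trivial. To verify the rest, let us notice from
Fourier analysis that

$$ \mathbf{P}(\eta_1^{(\mu)} v_1 + \dots + \eta_n^{(\mu)} v_n = a) = \int_0^1 e^{-2\pi i a \xi} \prod_{j=1}^{n} (1 - \mu + \mu \cos(2\pi v_j \xi)) \, d\xi. \tag{7} $$

When $0 < \mu \le 1/2$, the expression $1 - \mu + \mu \cos(2\pi v_j \xi)$ is non-negative, and we thus
have

$$ \mathbf{P}_\mu(\mathbf{v}) = \mathbf{P}(Y_{\mu,\mathbf{v}} = 0) = \int_0^1 \prod_{j=1}^{n} (1 - \mu + \mu \cos(2\pi v_j \xi)) \, d\xi. \tag{8} $$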

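To prove (3), notice that for any $0 < \mu \le 1$, $0 < \mu' \le \mu/4$ and any $\theta$ we have the
elementary inequality

$$ |(1 - \mu) + \mu \cos \theta| \le (1 - \mu') + \mu' \cos 2\theta. $$

Using this, we have

$$
\begin{aligned}
\mathbf{P}_\mu(\mathbf{v}) &\le \int_0^1 \prod_{j=1}^{n} |1 - \mu + \mu \cos(2\pi v_j \xi)| \, d\xi \\
&\le \int_0^1 \prod_{j=1}^{n} (1 - \mu' + \mu' \cos(4\pi v_j \xi)) \, d\xi \\
&= \int_0^1 \prod_{j=1}^{n} (1 - \mu' + \mu' \cos(2\pi v_j \xi)) \, d\xi = \mathbf{P}_{\mu'}(\mathbf{v}),
\end{aligned}
$$

where the next to last equality follows by changing $\xi$ to $2\xi$ and considering the
periodicity of cosine.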

Similarly, observe that for $0 < \mu \le 1/2$ and $k \ge 1$ we have

$$ (1 - \mu + \mu \cos(2\pi v_j \xi)) \le \left( 1 - \frac{\mu}{k} + \frac{\mu}{k} \cos(2\pi v_j \xi) \right)^k. $$

Indeed from the concavity of $\log(1 - t)$ when $0 \le t < 1$, we have $\log(1 - t) \le k \log(1 -
\frac{t}{k})$, and the claim follows by exponentiating this with $t := \mu(1 - \cos(2\pi v_j \xi))$. This
proves (4).

Finally, (5) is a consequence of (8) and Hölder's inequality, while (6) follows directly
from (5). ■
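The Fourier identity (7) used in the proof above can be sanity-checked numerically. The sketch below is an illustration under our own choices: it relies on the standard fact that an equispaced Riemann sum with $N$ points integrates a trigonometric polynomial exactly once $N$ exceeds its degree, and here the degree is at most $|a| + \sum_j |v_j|$. The function names `exact_prob` and `fourier_prob` are hypothetical.

```python
import cmath
import math


def exact_prob(v, mu, a):
    """P(sum_i eta_i v_i = a), computed by direct convolution in floating point."""
    dist = {0: 1.0}
    for step in v:
        nxt = {}
        for pos, p in dist.items():
            for shift, q in ((0, 1 - mu), (step, mu / 2), (-step, mu / 2)):
                nxt[pos + shift] = nxt.get(pos + shift, 0.0) + p * q
        dist = nxt
    return dist.get(a, 0.0)


def fourier_prob(v, mu, a):
    """Right-hand side of (7), evaluated by an equispaced sum over [0, 1)."""
    n_points = sum(abs(x) for x in v) + abs(a) + 1  # exceeds the trigonometric degree
    total = 0j
    for t in range(n_points):
        xi = t / n_points
        term = cmath.exp(-2j * math.pi * a * xi)
        for x in v:
            term *= 1 - mu + mu * math.cos(2 * math.pi * x * xi)
        total += term
    return (total / n_points).real
```

For instance, with $\mathbf{v} = 1\,2\,3$ and $\mu = 1$ both functions return $1/4$ at $a = 0$ up to rounding, matching the two sign patterns out of eight whose sum vanishes.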



Now we consider the distribution of the equal-steps random walk $\eta_1^{(\mu)} + \dots + \eta_m^{(\mu)} =
Y_{\mu, 1^m}$. Intuitively, this random walk is concentrated in an interval of length $O((1 +
\mu m)^{1/2})$ and has a roughly uniform distribution in the integers in this interval
(though when $\mu$ is close to 1, parity considerations may cause $Y_{\mu, 1^m}$ to favor the
even integers over the odd ones, or vice versa); compare with the discussion in
Example 2.1. The following lemma is a quantitative version of this intuition.

Lemma 5.2. For any $0 < \mu \le 1$ and $m \ge 1$ we have

$$ \mathbf{P}_\mu(1^m) = \sup_a \mathbf{P}(\eta_1^{(\mu)} + \dots + \eta_m^{(\mu)} = a) = O((\mu m)^{-1/2}). \tag{9} $$

In fact, we have the more general estimate

$$ \mathbf{P}(\eta_1^{(\mu)} + \dots + \eta_m^{(\mu)} = a) = O\left( (\tau^{-1} + (\mu m)^{-1/2}) \, \mathbf{P}(\eta_1^{(\mu)} + \dots + \eta_m^{(\mu)} \in [a - \tau, a + \tau]) \right) \tag{10} $$

for any $a \in \mathbf{Z}$ and $\tau \ge 1$.

Finally, if $\tau \ge 1$ and $S$ is any $\tau$-separated set of integers (i.e. any two distinct
elements of $S$ are at least $\tau$ apart) then

$$ \mathbf{P}(\eta_1^{(\mu)} + \dots + \eta_m^{(\mu)} \in S) \le O(\tau^{-1} + (\mu m)^{-1/2}). \tag{11} $$



INVERSE LITTLEWOOD-OFFORD AND CONDITION NUMBER 






Proof We first prove (9). From (3) we may assume /j. < 1/4, and then by (8) we 
have 

P^(l'")= /V-M + Mcos(20r 
Jo 

Next we use the elementary estimate 

1 - M + Aicos(27r0 < exp(-;u||ef /lOO), 

where ||^|| denotes the distance to the nearest integer. This imphes that P^(l™) is 
bounded from above by /^^ exp(— /irn||^|p/100)c?^, which is of order 0((^m)~^/^) 
(to see this notice that for ^ > 1000(/im)~^/^ the function exp(— /xm||^|p/100) is 
quite small and its integral is negligible). 

Now we prove (10). We may assume that $\tau \le (\mu m)^{1/2}$, since the claim for larger $\tau$ follows automatically. By symmetry we can take $a \ge 0$.

For each integer $a$, let $c_a$ denote the probability

$$c_a := \mathbf{P}(\eta_1^\mu + \dots + \eta_m^\mu = a).$$

Direct computation (letting $i$ denote the number of $\eta^\mu$ variables which equal zero) yields the explicit formula

$$c_a = \sum_{i=0}^{m} \binom{m}{i}(1-\mu)^i(\mu/2)^{m-i}\binom{m-i}{(m-i+a)/2},$$

with the convention that the binomial coefficient $\binom{a}{b}$ is zero when $b$ is not an integer between $0$ and $a$. This in particular yields the monotonicity property $c_a \ge c_{a+2}$ whenever $a \ge 0$. This is already enough to yield the claim when $a > \tau$, so it remains to verify the claim when $a \le \tau$. Now the random variable $\eta_1^\mu + \dots + \eta_m^\mu$ is symmetric around the origin and has variance $\mu m$, so from Chebyshev's inequality we know that

$$\sum_{0 \le a \le 2(\mu m)^{1/2}} c_a = \Theta(1).$$

From (9) we also have $c_a = O((\mu m)^{-1/2})$ for all $a$. From this and the monotonicity property $c_a \ge c_{a+2}$ and the pigeonhole principle we see that $c_a = \Theta((\mu m)^{-1/2})$ either for all even $0 \le a \le (\mu m)^{1/2}$, or for all odd $0 \le a \le (\mu m)^{1/2}$. In either case, the claim (10) is easily verified. The bound in (11) then follows by summing (10) over all $a \in S$ and noting that $\sum_a c_a = 1$. ■



One can also use the formula for Ca to prove (9) as well. The simple details are 
left as an exercise. 
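As a small illustration of the explicit formula for $c_a$ described in the proof above (sum over the number $i$ of zero steps, then count the sign patterns of the remaining $m - i$ steps), here is a hedged sketch; the function name `c` and the parameter values used to exercise it are choices made for this example.

```python
from math import comb

def c(a, mu, m):
    """P(eta_1 + ... + eta_m = a), summing over the number i of steps equal to zero."""
    total = 0.0
    for i in range(m + 1):
        n_signs = m - i                # number of +-1 steps
        if (n_signs + a) % 2 != 0:
            continue                   # convention: a non-integer lower index gives 0
        plus = (n_signs + a) // 2      # number of +1 steps needed to reach a
        if plus < 0 or plus > n_signs:
            continue                   # convention: an out-of-range lower index gives 0
        total += comb(m, i) * (1 - mu) ** i * (mu / 2) ** n_signs * comb(n_signs, plus)
    return total
```

With this in hand one can check on examples that the $c_a$ sum to 1, that $c_a = c_{-a}$, and that $c_a \ge c_{a+2}$ for $a \ge 0$.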
6. Proofs of the inverse theorems

We now have enough machinery to prove the inverse Littlewood-Offord theorems. 
We first give a quick proof of Proposition 2.3: 

Proof [of Proposition 2.3] Suppose that the conclusion failed. Then an easy greedy algorithm argument shows that $\mathbf{v}$ must contain a dissociated subword $\mathbf{w} = (w_1, \dots, w_{d+1})$ of length $d+1$. By (2), we have

$$2^{-d-1} < P_1(\mathbf{v}) \le P_1(\mathbf{w}).$$

On the other hand, since $\mathbf{w}$ is dissociated, all the sums of the form $\eta_1 w_1 + \dots + \eta_{d+1} w_{d+1}$ are distinct and so $P_1(\mathbf{w}) \le 2^{-d-1}$, yielding the desired contradiction. ■
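The last step of this proof can be illustrated by brute force. The sketch below uses the characterization quoted in the proof (a word is dissociated when all signed sums $\pm w_1 \pm \dots \pm w_{d+1}$ are distinct); the helper names are assumptions made for this example, and the enumeration is exponential in the word length, so it is only meant for tiny words.

```python
from collections import Counter
from itertools import product

def signed_sum_counts(w):
    """Count how often each value of +-w_1 +- ... +- w_n occurs over all sign patterns."""
    return Counter(
        sum(s * x for s, x in zip(signs, w))
        for signs in product((-1, 1), repeat=len(w))
    )

def is_dissociated(w):
    """True when all 2^n signed sums are distinct."""
    return max(signed_sum_counts(w).values()) == 1

def concentration_p1(w):
    """P_1(w): the largest probability that a uniformly random signed sum hits one value."""
    return max(signed_sum_counts(w).values()) / 2 ** len(w)
```

For instance, the word (1, 2, 4, 8) is dissociated and attains $P_1 = 2^{-4}$, while (1, 2, 3) is not, because $1 + 2 - 3 = -1 - 2 + 3$.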

To prove Theorem 2.4, we modify the above argument by replacing the notion of dissociativity by $k$-dissociativity. Unfortunately this makes the proof somewhat longer:

Proof [of Theorem 2.4] We construct a $k$-dissociated tuple $(w_1, \dots, w_r)$ for some $0 \le r \le d-1$ by the following algorithm:

• Step 0. Initialize $r = 0$. In particular, $(w_1, \dots, w_r)$ is trivially $k$-dissociated. From (4) we have

$$P_{\mu/4d}(\mathbf{v}^d) \ge P_{\mu/4}(\mathbf{v}) \ge P_\mu(\mathbf{v}). \qquad (12)$$

• Step 1. Count how many $1 \le j \le n$ there are such that $(w_1, \dots, w_r, v_j)$ is $k$-dissociated. If this number is less than $k^2$, halt the algorithm. Otherwise, move on to Step 2.

• Step 2. Applying the last property of Lemma 5.1, we can locate a $v_j$ such that $(w_1, \dots, w_r, v_j)$ is $k$-dissociated, and

$$P_{\mu/4d}(\mathbf{v}^{d-r} w_1^{k^2} \cdots w_r^{k^2}) \le P_{\mu/4d}(\mathbf{v}^{d-r-1} w_1^{k^2} \cdots w_r^{k^2} v_j^{k^2}). \qquad (13)$$

We then set $w_{r+1} := v_j$ and increase $r$ to $r+1$. Return to Step 1. Note that $(w_1, \dots, w_r)$ remains $k$-dissociated, and (12) remains true.

Suppose that we terminate at some step $r \le d-1$. Then we have an $r$-tuple $(w_1, \dots, w_r)$ which is $k$-dissociated, but such that $(w_1, \dots, w_r, v_j)$ is $k$-dissociated for at most $k^2$ values of $v_j$. Unwinding the definitions, this shows that for all but at most $k^2$ values of $v_j$, there exists $\tau \in [1, k]$ such that $\tau v_j \in Q(\mathbf{w}, k)$, proving the claim.

It remains to show that we must indeed terminate at some step $r \le d-1$. Assume (for a contradiction) that we have reached step $d$. Then we have a $k$-dissociated tuple $(w_1, \dots, w_d)$, and by (12), (13) we have

$$P_\mu(\mathbf{v}) \le P_{\mu/4d}(w_1^{k^2} \cdots w_d^{k^2}) = \mathbf{P}(Y_{\mu/4d,\,w_1^{k^2} \cdots w_d^{k^2}} = 0).$$

Let $\Gamma \subset \mathbf{Z}^d$ be the lattice

$$\Gamma := \{(m_1, \dots, m_d) \in \mathbf{Z}^d : m_1 w_1 + \dots + m_d w_d = 0\},$$

then by using independence we can write

$$P_\mu(\mathbf{v}) \le \mathbf{P}(Y_{\mu/4d,\,w_1^{k^2} \cdots w_d^{k^2}} = 0) = \sum_{(m_1,\dots,m_d) \in \Gamma}\ \prod_{j=1}^{d} \mathbf{P}(Y_{\mu/4d,\,1^{k^2}} = m_j). \qquad (14)$$
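Now we use a volume packing argument. From Lemma 5.2 we have

$$\mathbf{P}(Y_{\mu/4d,\,1^{k^2}} = m) = O_{\mu,d}\Big(\frac{1}{k} \sum_{m' \in m + (-k/2,\,k/2)} \mathbf{P}(Y_{\mu/4d,\,1^{k^2}} = m')\Big)$$

and hence from (14) we have

$$P_\mu(\mathbf{v}) \le O_{\mu,d}\Big(k^{-d} \sum_{(m_1,\dots,m_d) \in \Gamma}\ \sum_{(m'_1,\dots,m'_d) \in (m_1,\dots,m_d) + (-k/2,\,k/2)^d}\ \prod_{j=1}^{d} \mathbf{P}(Y_{\mu/4d,\,1^{k^2}} = m'_j)\Big).$$

Since $(w_1, \dots, w_d)$ is $k$-dissociated, all the $(m'_1, \dots, m'_d)$ tuples in $\Gamma + (-k/2, k/2)^d$ are different. Thus, we conclude

$$P_\mu(\mathbf{v}) \le O_{\mu,d}\Big(k^{-d} \sum_{(m_1,\dots,m_d) \in \mathbf{Z}^d}\ \prod_{j=1}^{d} \mathbf{P}(Y_{\mu/4d,\,1^{k^2}} = m_j)\Big).$$

But from the union bound we have

$$\sum_{(m_1,\dots,m_d) \in \mathbf{Z}^d}\ \prod_{j=1}^{d} \mathbf{P}(Y_{\mu/4d,\,1^{k^2}} = m_j) = 1,$$

so

$$P_\mu(\mathbf{v}) \le O_{\mu,d}(k^{-d}).$$

To complete the proof, set the constant $C = C(\mu, d)$ in the theorem to be larger than the hidden constant in $O_{\mu,d}(k^{-d})$. ■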

Remark 6.1. One can also use the Chernoff bound and obtain a shorter proof 
(avoiding the volume packing argument) but with an extra logarithmic loss in the 
estimates. 

Finally we perform some additional arguments to eliminate the $\frac{1}{\tau}$ dilations in Theorem 2.4 and obtain our final inverse Littlewood-Offord theorem. The key will be the following lemma.

Given a set $S$ and a number $v$, the torsion of $v$ with respect to $S$ is the smallest positive integer $\tau$ such that $\tau v \in S$. If no such $\tau$ exists, we say that $v$ has infinite torsion with respect to $S$.
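The definition of torsion translates directly into a search over multiples. In the sketch below the name `torsion`, the representation of $S$ as a finite Python set of integers, and the search cap `max_multiple` are all assumptions made for illustration; returning `None` only means that no multiple was found below the cap.

```python
def torsion(v, S, max_multiple=10_000):
    """Smallest positive integer t with t*v in S, or None if none is found up to the cap."""
    for t in range(1, max_multiple + 1):
        if t * v in S:
            return t
    return None
```

For example, with $S = \{-6, -3, 0, 3, 6\}$ the element 2 has torsion 3 (since $3 \cdot 2 = 6 \in S$), while no multiple of 5 lies in $S$.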

The key new ingredient will be the following lemma, which asserts that adding 
a high torsion element to a random walk reduces the concentration probability 
significantly. 
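Lemma 6.2 (Torsion implies dispersion). Let $0 < \mu \le 1$ and consider a GAP $Q := \{\sum_{i=1}^{d} x_i W_i \mid -L_i \le x_i \le L_i\}$. Assume that $W_{d+1}$ has finite torsion $\tau$ with respect to $2Q$. Then there is a constant $C_\mu$ depending only on $\mu$ such that

$$P_\mu(W_1^{L_1} \cdots W_d^{L_d} W_{d+1}^{\tau^2}) \le C_\mu \tau^{-1} P_\mu(W_1^{L_1} \cdots W_d^{L_d}).$$

Proof Let $a$ be an integer such that

$$P_\mu(W_1^{L_1} \cdots W_d^{L_d} W_{d+1}^{\tau^2}) = \mathbf{P}\Big(\sum_{i=1}^{d} W_i \sum_{j=1}^{L_i} \eta^\mu_{j,i} + W_{d+1} \sum_{j=1}^{\tau^2} \eta^\mu_{j,d+1} = a\Big),$$

where the $\eta^\mu_{j,i}$ are i.i.d. copies of $\eta^\mu$. It suffices to show that

$$\mathbf{P}\Big(\sum_{i=1}^{d} W_i \sum_{j=1}^{L_i} \eta^\mu_{j,i} + W_{d+1} \sum_{j=1}^{\tau^2} \eta^\mu_{j,d+1} = a\Big) = O_\mu(\tau^{-1})\,P_\mu(W_1^{L_1} \cdots W_d^{L_d}).$$

Let $S$ be the set of all $m \in [-\tau^2, \tau^2]$ such that $Q + mW_{d+1}$ contains $a$. Observe that in order for $\sum_{i=1}^{d} W_i \sum_{j=1}^{L_i} \eta^\mu_{j,i} + W_{d+1} \sum_{j=1}^{\tau^2} \eta^\mu_{j,d+1}$ to equal $a$, the quantity $\sum_{j=1}^{\tau^2} \eta^\mu_{j,d+1}$ must lie in $S$. By the definition of $P_\mu(W_1^{L_1} \cdots W_d^{L_d})$ and Bayes' identity, we conclude

$$\mathbf{P}\Big(\sum_{i=1}^{d} W_i \sum_{j=1}^{L_i} \eta^\mu_{j,i} + W_{d+1} \sum_{j=1}^{\tau^2} \eta^\mu_{j,d+1} = a\Big) \le P_\mu(W_1^{L_1} \cdots W_d^{L_d})\,\mathbf{P}\Big(\sum_{j=1}^{\tau^2} \eta^\mu_{j,d+1} \in S\Big).$$

Consider two elements $x, y \in S$. By the definition of $S$, $(x - y)W_{d+1} \in Q - Q = 2Q$. From the definition of $\tau$, $|x - y|$ is either zero or at least $\tau$. This implies that $S$ is $\tau$-separated and the claim now follows from Lemma 5.2. ■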

We will also need the following technical lemma. 

Lemma 6.3. Consider a GAP $Q(\mathbf{w}, L)$. Assume that $v$ is an element with (finite) torsion $\tau$ with respect to $Q(\mathbf{w}, L)$. Then

$$Q(\mathbf{w}, L) + Q(v, L') \subset \frac{1}{\tau} \cdot Q(\mathbf{w}, L(L' + \tau)).$$

Proof Assume $\mathbf{w} = w_1 \dots w_r$. We can write $v$ as $\frac{1}{\tau}\sum_{i=1}^{r} a_i w_i$ where $|a_i| \le L$. An element $y$ in $Q(\mathbf{w}, L) + Q(v, L')$ can be written as

$$y = \sum_{i=1}^{r} x_i w_i + xv,$$

where $|x_i| \le L$ and $|x| \le L'$. Substituting $v$, we have

$$y = \sum_{i=1}^{r} x_i w_i + \frac{x}{\tau}\sum_{i=1}^{r} a_i w_i = \frac{1}{\tau}\sum_{i=1}^{r} w_i(\tau x_i + x a_i),$$

where $|\tau x_i + x a_i| \le \tau L + L'L$. This concludes the proof. ■
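Lemma 6.3 can be checked by brute force on small examples. The sketch below assumes the convention suggested by the proof, namely $Q(\mathbf{w}, L) = \{\sum_i x_i w_i : |x_i| \le L\}$ with integer coefficients; the helper names and the search cap are choices made for this illustration.

```python
from itertools import product

def gap(generators, L):
    """Q(w, L) = { sum_i x_i * w_i : integers |x_i| <= L } as a set of integers."""
    return {
        sum(x * g for x, g in zip(xs, generators))
        for xs in product(range(-L, L + 1), repeat=len(generators))
    }

def torsion(v, S, max_multiple=1000):
    """Smallest positive t with t*v in S, or None if none is found up to the cap."""
    for t in range(1, max_multiple + 1):
        if t * v in S:
            return t
    return None

def lemma_6_3_holds(w, L, v, L_prime):
    """Check Q(w, L) + Q(v, L') is contained in (1/tau) * Q(w, L(L' + tau))."""
    Q = gap(w, L)
    tau = torsion(v, Q)
    if tau is None:
        raise ValueError("no torsion found below the search cap")
    big = gap(w, L * (L_prime + tau))
    # y lies in (1/tau) * Q(w, L(L'+tau)) exactly when tau*y lies in Q(w, L(L'+tau)).
    return all(tau * (q + x * v) in big for q in Q for x in range(-L_prime, L_prime + 1))
```

For example, with generators (3, 10), $L = 2$ and $v = 8$ the torsion is 2, because $16 = 2 \cdot 3 + 1 \cdot 10$, and the containment can be confirmed for $L' = 3$.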

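Proof [of Theorem 2.5] We begin by running the algorithm in the proof of Theorem 2.4 to locate a word $\mathbf{w}$ of length at most $d-1$ such that the set $\bigcup_{1 \le \tau \le k} \frac{1}{\tau} \cdot Q(\mathbf{w}, k)$ covers all but at most $k^2$ elements of $\mathbf{v}$. Let $\mathbf{v}^{[0]}$ be the word formed by removing the (at most $k^2$) exceptional elements from $\mathbf{v}$ which do not lie in $\bigcup_{1 \le \tau \le k} \frac{1}{\tau} \cdot Q(\mathbf{w}, k)$.

By increasing the constant $k_0$ in the assumption of the theorem, we can assume, in all arguments below, that $k$ is sufficiently large, whenever needed.

By (2), (3)

$$P_{\mu/4d}(\mathbf{v}^{[0]}\mathbf{w}^{k^2}) \ge P_{\mu/4d}(\mathbf{v}\mathbf{w}^{k^2}) \ge P_{\mu/4d}(\mathbf{v})\,P_{\mu/4d}(\mathbf{w}^{k^2}) \ge k^{-d} P_{\mu/4d}(\mathbf{w}^{k^2}). \qquad (15)$$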

In the following, we assume that there is at least one non-zero entry in w, as 
otherwise the claim is trivial. 

Now we perform an additional algorithm. Let $K = K(\mu, d, \varepsilon) \ge 2$ be a large constant to be chosen later.

• Step 0. Initialize $i = 1$ and set $Q_0 := Q(\mathbf{w}, k^2)$ and $\mathbf{v}^{[0]}$ as above.

• Step 1. Count how many $v \in \mathbf{v}^{[i-1]}$ have torsion at least $K$ with respect to $2Q_{i-1}$. (We need to have the factor 2 here in order to apply Lemma 6.2.) If this number is less than $k^2$, halt the algorithm. Otherwise, move on to Step 2.

• Step 2. Locate a multiset $S$ of $k^2$ elements of $\mathbf{v}^{[i-1]}$ with torsion at least $K$ with respect to $2Q_{i-1}$. Applying (6), we can find an element $v \in S$ such that

$$P_{\mu/4d}(\mathbf{v}^{[i-1]}\mathbf{w}^{k^2} W_1^{\tau_1^2} \cdots W_{i-1}^{\tau_{i-1}^2}) \le P_{\mu/4d}(\mathbf{v}^{[i]}\mathbf{w}^{k^2} W_1^{\tau_1^2} \cdots W_{i-1}^{\tau_{i-1}^2} v^{k^2}),$$

where $\mathbf{v}^{[i]}$ is obtained from $\mathbf{v}^{[i-1]}$ by deleting $S$.

Let $\tau_i$ be the torsion of $v$ with respect to $2Q_{i-1}$. Since every element of $\mathbf{v}^{[0]}$ has torsion at most $k$ with respect to $Q_0$, $K \le \tau_i \le k$. We then set $W_i := v$, $Q_i := Q_{i-1} + Q(W_i, \tau_i^2)$, increase $i$ to $i+1$ and return to Step 1.

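Consider a stage $i$ of the algorithm. From construction and induction and (15), we have a word $W_1 \dots W_i$ with

$$P_{\mu/4d}(\mathbf{v}^{[i]}\mathbf{w}^{k^2} W_1^{\tau_1^2} \cdots W_i^{\tau_i^2}) \ge P_{\mu/4d}(\mathbf{v}^{[0]}\mathbf{w}^{k^2}) \ge k^{-d} P_{\mu/4d}(\mathbf{w}^{k^2}).$$

On the other hand, by applying Lemma 6.2 iteratively, we have

$$P_{\mu/4d}(\mathbf{w}^{k^2} W_1^{\tau_1^2} \cdots W_i^{\tau_i^2}) \le P_{\mu/4d}(\mathbf{w}^{k^2}) \prod_{j=1}^{i} C_\mu \tau_j^{-1}.$$

It follows that $\prod_{j=1}^{i}(C_\mu \tau_j^{-1}) \ge k^{-d}$, or equivalently $\prod_{j=1}^{i}(C_\mu^{-1}\tau_j) \le k^d$. Recall that $\tau_j \ge K$. Thus by setting $K$ sufficiently large (compared to $C_\mu$, $d$ and $1/\varepsilon$), we can guarantee that

$$\prod_{j=1}^{i} \tau_j \le k^{d+\varepsilon/2d} \qquad (16)$$

where $\varepsilon$ is the constant in the assumption of the theorem. It also follows that the algorithm must terminate at some stage $D \le \log_K k^{d+\varepsilon/2d} \le (d+1)\log_K k$.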

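Let us take a look at the final set $Q_D$. Applying Lemma 6.3 iteratively we have

$$Q_D \subset \Big(\prod_{i=1}^{D} \frac{1}{\tau_i}\Big) \cdot Q(\mathbf{w}, L_D)$$

where $L_0 := k^2$ and

$$L_i := L_{i-1}(\tau_i + \tau_i^2) \le (1 + 1/K)L_{i-1}\tau_i^2. \qquad (17)$$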

We now show that the GAP $Q := \frac{1}{K!} \cdot (2K!)Q(\mathbf{w}, L_D) = \frac{1}{K!} \cdot Q(\mathbf{w}, 2K!L_D)$ satisfies the claims of the theorem.

• (Rank) We have $\mathrm{rank}(Q) = \mathrm{rank}(Q(\mathbf{w}, L_D)) = \mathrm{rank}(Q_0) = r \le d-1$, as shown in the proof of the previous theorem.

• (Volume) We have $\mathrm{Vol}(Q) = (2K!)^r\,\mathrm{Vol}(Q(\mathbf{w}, L_D)) = O(\mathrm{Vol}(Q(\mathbf{w}, L_D)))$. On the other hand, by (16) and (17)

$$\mathrm{Vol}(Q(\mathbf{w}, L_D)) = (2L_D + 1)^r \le (3L_D)^r = O\Big(\Big(k^2(1 + 1/K)^D \prod_{i=1}^{D} \tau_i^2\Big)^r\Big) = O\Big(\big(k^{2+2(d+\varepsilon/2d)}(1 + 1/K)^D\big)^r\Big).$$

By definition, $D \le \log_K k^{d+\varepsilon/2d} \le \log k$, given that $K$ is sufficiently large compared to $d$. Thus $(1 + 1/K)^D \le \exp(D/K) \le k^{1/K}$, which implies that

$$\mathrm{Vol}(Q(\mathbf{w}, L_D)) = O(k^{(2+2(d+\varepsilon/2d)+1/K)r}) = O(k^{2(d^2-1)+\varepsilon})$$

provided that $r \le d-1$ and $K$ is sufficiently large compared to $d$ and $1/\varepsilon$. (The asymptotic notation here is used under the assumption that $k \to \infty$.)

• (Number of exceptional elements) At each stage in the second algorithm, we discard a set of $k^2$ elements, thus all but $Dk^2 \le (d+1)k^2\log_K k$ elements of $\mathbf{v}^{[0]}$ have torsion at most $K$ with respect to $2Q_D$. As $Q_D \subset Q(\mathbf{w}, L_D)$ and $|\mathbf{v} \setminus \mathbf{v}^{[0]}| \le k^2$, it follows that all but at most

$$(d+1)k^2\log_K k + k^2$$

elements of $\mathbf{v}$ have torsion at most $K$ with respect to $2Q(\mathbf{w}, L_D) = Q(\mathbf{w}, 2L_D)$. By setting $K$ sufficiently large compared to $d$ and $1/\varepsilon$, we can guarantee that

$$(d+1)k^2\log_K k + k^2 \le \varepsilon k^2\log k.$$

To conclude, notice that any element with torsion at most $K$ with respect to $Q(\mathbf{w}, 2L_D)$ belongs to $Q := \frac{1}{K!} \cdot Q(\mathbf{w}, 2K!L_D)$. Thus, $Q$ contains all but at most $\varepsilon k^2\log k$ elements of $\mathbf{v}$.

• {Generators) The generators of • Q{w,2K\Ld) are Wi, 1 < 

i < r. Since Wi € -v and H^i'^i < kd+e/2d ^ o(A:''+^), the claim about 
generators follows. 

The proof is complete. ■ 



7. The smallest singular value 



In this section, we prove Theorem 3.4, modulo two key results (Theorem 3.7 and Corollary 3.11), which will be proved in later sections.

Let $B > 10$ be a large number (depending on $A$) to be chosen later. Suppose that $\sigma_n(M_n^\mu) < n^{-B}$. This means that there exists a unit vector $v$ such that

$$\|M_n^\mu v\| < n^{-B}.$$

By rounding each coordinate of $v$ to the nearest multiple of $n^{-B-2}$, we can find a vector $\tilde v \in n^{-B-2} \cdot \mathbf{Z}^n$ of magnitude $0.9 \le \|\tilde v\| \le 1.1$ such that

$$\|M_n^\mu \tilde v\| \le 2n^{-B}.$$

Writing $w := n^{B+2}\tilde v$, we thus can find an integer vector $w \in \mathbf{Z}^n$ of magnitude $0.9n^{B+2} \le \|w\| \le 1.1n^{B+2}$ such that

$$\|M_n^\mu w\| \le 2n^2.$$

Let $\Omega$ be the set of integer vectors $w \in \mathbf{Z}^n$ of magnitude $0.9n^{B+2} \le \|w\| \le 1.1n^{B+2}$. It suffices to show the probability bound

$$\mathbf{P}(\text{there is some } w \in \Omega \text{ such that } \|M_n^\mu w\| \le 2n^2) = O_{A,\mu}(n^{-A}).$$
We now partition the elements $w = (w_1, \dots, w_n)$ of $\Omega$ into three sets:

• We say that $w$ is rich if

$$P_\mu(w) \ge n^{-A-10}$$

and poor otherwise. Let $\Omega_1$ be the set of poor $w$'s.

• A rich $w$ is singular if fewer than $n^{0.2}$ of its coordinates have absolute value $n^{B-10}$ or greater. Let $\Omega_2$ be the set of rich and singular $w$'s.

• A rich $w$ is non-singular if at least $n^{0.2}$ of its coordinates have absolute value $n^{B-10}$ or greater. Let $\Omega_3$ be the set of rich and non-singular $w$'s.

The desired estimate follows directly from the following lemmas and the union 
bound. 

Lemma 7.1 (Estimate for poor $w$).

$$\mathbf{P}(\text{there is some } w \in \Omega_1 \text{ such that } \|M_n^\mu w\| \le 2n^2) = o(n^{-A}).$$

Lemma 7.2 (Estimate for rich singular $w$).

$$\mathbf{P}(\text{there is some } w \in \Omega_2 \text{ such that } \|M_n^\mu w\| \le 2n^2) = o(n^{-A}).$$

Lemma 7.3 (Estimate for rich non-singular $w$).

$$\mathbf{P}(\text{there is some } w \in \Omega_3 \text{ such that } \|M_n^\mu w\| \le 2n^2) = o(n^{-A}).$$

Remark 7.4. Our arguments will show that the probabilities in Lemmas 7.2 and 7.3 are exponentially small.

The proofs of Lemmas 7.1 and 7.2 are relatively simple and rely on well-known 
methods. We delay these proofs to the end of this section and focus on the proof 
of Lemma 7.3, which is the heart of the matter, and which uses all the major tools 
discussed in previous sections. 

Proof [of Lemma 7.3] Informally, the strategy is to use the inverse Littlewood- 
Offord theorem (Corollary 2.7) to place the integers wi,. . . ,Wn in a progression, 
which we then discretize using Theorem 3.7. This allows us to replace the event 
ll^^u'll < 2n^ by the discretized event M^'^ = for a suitable Y, at which point 
we apply Corollary 3.11. 

We turn to the details. Since $w$ is rich, we see from Corollary 2.7 that there exists a symmetric GAP $Q$ of integers of rank at most $A'$ and volume at most $n^{A'}$ which contains all but $\lfloor n^{0.1} \rfloor$ of the integers $w_1, \dots, w_n$, where $A'$ is a constant depending on $\mu$ and $A$. Also the generators of $Q$ are of the form $w_i/s$ for some $1 \le i \le n$ and $1 \le s \le n^{A'}$.

Using the description of $Q$ and the fact that $w_1, \dots, w_n$ are polynomially bounded (in $n$), it is easy to derive that the total number of possible $Q$ is $n^{O_{A'}(1)}$. Next, by paying a factor of

$$\binom{n}{\lfloor n^{0.1} \rfloor} \le n^{\lfloor n^{0.1} \rfloor} = \exp(o(n))$$

we may assume that it is the last $\lfloor n^{0.1} \rfloor$ integers $w_{m+1}, \dots, w_n$ which possibly lie outside $Q$, where we set $m := n - \lfloor n^{0.1} \rfloor$. As each of the $w_i$ has absolute value at most $1.1n^{B+2}$, the number of ways to fix these exceptional elements is at most $(2.2n^{B+2})^{n^{0.1}} = \exp(o(n))$. Overall, it costs a factor of only $\exp(o(n))$ to fix $Q$ and the positions and values of the exceptional elements of $w$.

Once we have fixed $w_{m+1}, \dots, w_n$, we can then write

$$M_n^\mu w = w_1 X_1^\mu + \dots + w_m X_m^\mu + Y,$$

where $Y$ is a random variable determined by $X_i^\mu$ and $w_i$, $m < i \le n$. (In this proof we think of $X_i^\mu$ as the column vectors of the matrix.) For any number $y$, let $F_y$ be the event that there exist $w_1, \dots, w_m$ in $Q$, where at least one of the $w_i$ has absolute value at least $n^{B-10}$, such that

$$|w_1 X_1^\mu + \dots + w_m X_m^\mu + y| \le 2n^2.$$

It suffices to prove that

$$\mathbf{P}(F_y) = o(n^{-A})$$

for any $y$. Our argument will in fact show that this probability is exponentially small.

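We now apply Theorem 3.7 to the GAP $Q$ with $R_0 := n^{B/2}$ and $S := n^{10}$ to find a scale $R = n^{B/2 + O_A(1)}$ and symmetric GAPs $Q_{\mathrm{sparse}}$, $Q_{\mathrm{small}}$ of rank at most $A'$ and volume at most $n^{A'}$ such that

• $Q \subseteq Q_{\mathrm{sparse}} + Q_{\mathrm{small}}$.

• $Q_{\mathrm{small}} \subset [-n^{-10}R, n^{-10}R]$.

• The elements of $n^{10}Q_{\mathrm{sparse}}$ are $n^{10}R$-separated.

Since $Q$ (and hence $n^{10}Q$) contains $w_1, \dots, w_m$, we can therefore write

$$w_j = w_j^{\mathrm{sparse}} + w_j^{\mathrm{small}}$$

for all $1 \le j \le m$, where $w_j^{\mathrm{sparse}} \in Q_{\mathrm{sparse}}$ and $w_j^{\mathrm{small}} \in Q_{\mathrm{small}}$. In fact, this decomposition is unique.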

Suppose that the event $F_y$ holds. Writing $X_j^\mu = (\eta^\mu_{1,j}, \dots, \eta^\mu_{n,j})$ (where $\eta^\mu_{i,j}$ are, of course, i.i.d copies of $\eta^\mu$) and $y = (y_1, \dots, y_n)$, we have

$$w_1\eta^\mu_{i,1} + \dots + w_m\eta^\mu_{i,m} = y_i + O(n^2)$$

for all $1 \le i \le n$. Splitting the $w_j$ into sparse and small components and estimating the small components using the triangle inequality, we obtain

$$w_1^{\mathrm{sparse}}\eta^\mu_{i,1} + \dots + w_m^{\mathrm{sparse}}\eta^\mu_{i,m} = y_i + O(n^{-9}R)$$

for all $1 \le i \le n$. Note that the left-hand side lies in $mQ_{\mathrm{sparse}} \subset n^{10}Q_{\mathrm{sparse}}$, which is known to be $n^{10}R$-separated. Thus there is a unique value for the right-hand side, call it $y'_i$, which depends only on $y$ and $Q$ such that

$$w_1^{\mathrm{sparse}}\eta^\mu_{i,1} + \dots + w_m^{\mathrm{sparse}}\eta^\mu_{i,m} = y'_i.$$

The point is that we have now eliminated the $O()$ errors, and have thus essentially converted the singular value problem to the zero determinant problem. Note also that since one of the $w_1, \dots, w_m$ is known to have magnitude at least $n^{B-10}$ (which will be much larger than $n^{10}R$ if $B$ is chosen large depending on $A$), we see that at least one of the $w_1^{\mathrm{sparse}}, \dots, w_m^{\mathrm{sparse}}$ is non-zero.

Consider the random matrix $M'$ of order $m \times (m+1)$ whose entries are i.i.d copies of $\eta^\mu$ and let $y' \in \mathbf{R}^{m+1}$ be the column vector $y' = (y'_1, \dots, y'_{m+1})$. We conclude that if the event $F_y$ holds, then there exists a non-zero vector $w \in \mathbf{R}^m$ such that $M'w = y'$. But from Corollary 3.11, this holds with the desired probability

$$\exp(-\Omega(m+1)) = \exp(-\Omega(n)) = o(n^{-A})$$

and we are done. ■

Proof [of Lemma 7.1] We use a conditioning argument, following [20]. (An argument of the same spirit was used by Komlós to prove the bound $O(n^{-1/2})$ for the singularity problem [2].)

Let $M$ be a matrix such that there is $w \in \Omega_1$ satisfying $\|Mw\| \le 2n^2$. Since $M$ and its transpose have the same spectral norm, there is a vector $w'$ which has the same norm as $w$ such that $\|w'M\| \le 2n^2$. Let $u = w'M$ and let $X_i$ be the row vectors of $M$. Then

$$u = \sum_{i=1}^{n} w'_i X_i,$$

where $w'_i$ are the coordinates of $w'$.

Now we think of $M$ as a random matrix. By paying a factor of $n$, we can assume that $w'_n$ has the largest absolute value among the $w'_i$. We expose the first $n-1$ rows $X_1, \dots, X_{n-1}$ of $M$. If there is $w \in \Omega_1$ satisfying $\|Mw\| \le 2n^2$, then there is a vector $y \in \Omega_1$, depending only on the first $n-1$ rows, such that

$$\Big(\sum_{i=1}^{n-1}(X_i \cdot y)^2\Big)^{1/2} \le 2n^2.$$

Now consider the inner product $X_n \cdot y$. We can write $X_n$ as

$$X_n = \frac{1}{w'_n}\Big(u - \sum_{i=1}^{n-1} w'_i X_i\Big).$$

Thus,

$$|X_n \cdot y| = \frac{1}{|w'_n|}\Big|u \cdot y - \sum_{i=1}^{n-1} w'_i X_i \cdot y\Big|.$$

The right hand side, by the triangle inequality, is at most

$$\frac{1}{|w'_n|}\Big(\|u\|\|y\| + \|w'\|\Big(\sum_{i=1}^{n-1}(X_i \cdot y)^2\Big)^{1/2}\Big).$$

By assumption $|w'_n| \ge n^{-1/2}\|w'\|$. Furthermore, as $\|u\| \le 2n^2$, $\|u\|\|y\| \le 2n^2\|y\| \le 3n^2\|w'\|$ as $\|w'\| = \|w\|$ and both $y$ and $w$ belong to $\Omega_1$. (Any two vectors in $\Omega_1$ have roughly the same length.) Finally $(\sum_{i=1}^{n-1}(X_i \cdot y)^2)^{1/2} \le 2n^2$. Putting all these together, we have

$$|X_n \cdot y| \le 5n^{5/2}.$$

Recall that $y$ is fixed (after we expose the first $n-1$ rows) and $X_n$ is a copy of $X^\mu$. The probability that $|X^\mu \cdot y| \le 5n^{5/2}$ is at most $(10n^{5/2} + 1)P_\mu(y)$. On the other hand, $y$ is poor, so $P_\mu(y) \le n^{-A-10}$. Thus, it follows that

$$\mathbf{P}(\text{there is some } w \in \Omega_1 \text{ such that } \|M_n^\mu w\| \le 2n^2) \le n^{-A-10}(10n^{5/2} + 1)n = o(n^{-A}),$$

where the extra factor $n$ comes from the assumption that $w'_n$ has the largest absolute value. This completes the proof. ■

Proof [of Lemma 7.2] We use an argument from [15]. The key point will be that the set of rich singular vectors has sufficiently low entropy that one can proceed using the union bound.

A set $N$ of vectors on the $n$-dimensional unit sphere $S_{n-1}$ is said to be an $\varepsilon$-net if for any $x \in S_{n-1}$, there is $y \in N$ such that $\|x - y\| \le \varepsilon$. A standard greedy argument shows

Lemma 7.5. For any $n$ and $\varepsilon \le 1$, there exists an $\varepsilon$-net of cardinality at most $O(1/\varepsilon)^n$.
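The greedy argument behind Lemma 7.5 can be sketched on a finite sample of the sphere. The code below is an illustration under stated assumptions: it builds a net of a finite list of sample points (here on the unit circle), not of the whole sphere, and the helper names are hypothetical. A maximal $\varepsilon$-separated subset is automatically an $\varepsilon$-net of the sample, and the standard volume-packing bound limits its size to $(1 + 2/\varepsilon)^n$ in $\mathbf{R}^n$.

```python
import math

def circle_points(count):
    """Equally spaced sample points on the unit circle in R^2."""
    return [(math.cos(2 * math.pi * i / count), math.sin(2 * math.pi * i / count))
            for i in range(count)]

def greedy_net(points, eps):
    """Greedily keep each point that is more than eps away from all points kept so far.

    Every rejected point is within eps of a kept point, so the result is an
    eps-net of `points`; the kept points are pairwise more than eps apart.
    """
    net = []
    for p in points:
        if all(math.dist(p, q) > eps for q in net):
            net.append(p)
    return net
```

For instance, `greedy_net(circle_points(500), 0.1)` returns a net whose size is far below the packing bound $(1 + 2/0.1)^2 = 441$.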

Next, a simple concentration of measure argument shows

Lemma 7.6. For any fixed vector $y$ of magnitude between 0.9 and 1.1

$$\mathbf{P}(\|M_n^\mu y\| \le 5n^{-2}) = \exp(-\Omega(n)).$$

It suffices to verify this statement for the case $|y| = 1$. Notice that

$$\|M_n^\mu y\|^2 = \sum_{i=1}^{n}(X_i \cdot y)^2 = \sum_{i=1}^{n} Z_i,$$

where $Z_i = (X_i \cdot y)^2$. The $Z_i$ are i.i.d random variables with expectation $\mu$ and bounded variance. Thus $\sum_{i=1}^{n} Z_i$ has mean $\Omega(n)$ and the claimed bound follows from Chernoff's large deviation inequality (see, e.g., [28, Chapter 1]). (In fact, one can replace the $5n^{-2}$ by $cn^{1/2}$ for some small constant $c$, but this refinement is not necessary.)

For a vector $w \in \Omega_2$, let $w'$ be its normalization $w' := w/\|w\|$. Thus, $w'$ is a unit vector with at most $n^{0.2}$ coordinates with absolute value at least $n^{-10}$. Let $\Omega'_2$ be the collection of those $w'$ with this property.

If $\|Mw\| \le 2n^2$ for some $w \in \Omega_2$, then $\|Mw'\| \le 3n^{-B}$, as $\|w\| \ge 0.9n^{B+2}$. Thus, it suffices to give an exponential bound on the event that there is $w' \in \Omega'_2$ such that $\|M_n^\mu w'\| \le 3n^{-B}$.

By paying a factor $\binom{n}{n^{0.2}} = \exp(o(n))$ in probability, we can assume that the large coordinates (with absolute value at least $n^{-10}$) are among the first $l := n^{0.2}$ coordinates. Consider an $n^{-3}$-net $N$ in $S_{l-1}$. For each vector $y \in N$, let $y'$ be the $n$-dimensional vector obtained from $y$ by letting the last $n - l$ coordinates be zeros, and let $N'$ be the set of all such vectors obtained. These vectors have magnitude between 0.9 and 1.1, and from Lemma 7.5 we have $|N'| \le O(n^3)^l$.

Now consider a rich singular vector $w' \in \Omega'_2$ and let $w''$ be the $l$-dimensional vector formed by the first $l$ coordinates of this vector. As the remaining coordinates are small, $\|w''\| = 1 + O(n^{-9.5})$. There is a vector $y \in N$ such that

$$\|y - w''\| \le n^{-3} + O(n^{-9.5}).$$

It follows that there is a vector $y' \in N'$ such that

$$\|y' - w'\| \le n^{-3} + O(n^{-9.5}) \le 2n^{-3}.$$

For any matrix $M$ of norm at most $n$

$$\|Mw'\| \ge \|My'\| - 2n^{-3}n = \|My'\| - 2n^{-2}.$$

It follows that if $\|Mw'\| \le 3n^{-B}$ for some $B \ge 2$, then $\|My'\| \le 5n^{-2}$. Now take $M = M_n^\mu$. For each fixed $y'$, the probability that $\|My'\| \le 5n^{-2}$ is at most $\exp(-\Omega(n))$, by Lemma 7.6. Furthermore, the number of $y'$ is subexponential (at most $O(n^3)^l = O(n)^{3n^{0.2}} = \exp(o(n))$). Thus the claim follows directly by the union bound. ■



8. Discretization of progressions 



The purpose of this section is to prove Theorem 3.7. The arguments here are ele- 
mentary (based mostly on the pigeonhole principle and linear algebra, in particular 
Cramer's rule) and can be read independently of the rest of the paper. 

We shall follow the informal strategy outlined in Section 3.6. We begin with a 
preliminary observation, that basically asserts the intuitive fact that progressions 
do not contain large lacunary subsets. 

Lemma 8.1. Let $P \subset \mathbf{Z}$ be a symmetric generalized arithmetic progression of rank $d$ and volume $V$, and let $x_1, \dots, x_{d+1}$ be non-zero elements of $P$. Then there exist $1 \le i < j \le d+1$ such that

$$C_d^{-1}V^{-1}|x_i| \le |x_j| \le C_dV|x_i|$$

for some constant $C_d > 0$ depending only on $d$.

Proof We may order $|x_{d+1}| \ge |x_d| \ge \dots \ge |x_1|$. If we write

$$P = \{m_1v_1 + \dots + m_dv_d : |m_i| \le M_i \text{ for all } 1 \le i \le d\}$$

(so that $V = \Theta_d(M_1 \cdots M_d)$), then each of the $x_1, \dots, x_{d+1}$ can be written as a linear combination of the $v_1, \dots, v_d$. Applying Cramer's rule, we conclude that there exists a non-trivial relation

$$a_1x_1 + \dots + a_{d+1}x_{d+1} = 0$$

where $a_1, \dots, a_{d+1} = O_d(V)$ are integers, not all zero. If we let $j$ be the largest index such that $a_j$ is non-zero, then $j > 1$ (since $x_1$ is non-zero) and we conclude in particular that

$$|x_j| = O(|a_jx_j|) = O_d(V|x_{j-1}|)$$

from which the claim follows. ■

Proof [of Theorem 3.7] We can assume that i?o is very large compared to (SV)'^''^^'' 
since otherwise the claim is trivial (take i^sparse '~ 

P and -fsmaii • — {0})- We can 

also take V >2. 

Let B = Bd he & large integer depending only on d to be chosen later. The 
first step is to subdivide the interval [{SV)'^''*^ Rq, (SV)'^''*^ Rq] into 9(5) over- 
lapping subintervals of the form [{SV)~^ * R,{SV)^ ^ R], with every integer 
being contained in at most 0{1) of the subintervals. From Lemma 8.1 and the 
pigeonhole principle we see that at most 0^(1) of the intervals can contain an el- 
ement of {SV)'^''P (which has volume 0((5F)'^<*(^'')). If we let B be sufficiently 
large, we can thus find an interval [(51^) "■^ ^ R,{SV)^ ^ i?] which is disjoint from 
{SV)^ P. Since P is symmetric, this means that every x € (SV)^ P is either 
larger than (5y)^^^^i? in magnitude, or smaller than {SV)~^'^*^ R in magnitude. 

Having located a good scale $R$ to discretize, we now split $P$ into small ($\ll R$) and sparse ($\gg R$-separated) components. We write $P$ explicitly as

$$P = \{m_1v_1 + \dots + m_dv_d : |m_i| \le M_i \text{ for all } 1 \le i \le d\}$$

so that $V = \Theta_d(M_1 \cdots M_d)$ and more generally

$$kP = \{m_1v_1 + \dots + m_dv_d : |m_i| \le kM_i \text{ for all } 1 \le i \le d\}$$

for any A; > 1. For any 1 < s < B, let c Z"^ denote the set 

As := {(mi, . . . , TTid) ■■ \mi\ < V^" Mi for all 1 < i < rf; |mifi+. . .+m.dVd\ < {SV)~' 

Roughly speaking, this space corresponds to the kernel of $ as discussed in Section 
3.6; the additional parameter s is a technicality needed to compensate for the fact 
that boxes, unlike vector spaces, are not quite closed under dilations. We now view 
As as a subset of the Euclidean space R''. As such it spans a vector space Xg c R''. 
Clearly 

Xi C X2 C . . . C 

so if B is large enough, then by the pigeonhole principle (applied to the dimensions 
of these vector spaces) we can find 1 < s < B such that we have the stabilization 
property Xg = Xs+i- Let the dimension of this space be r, thus < r < d. 

There are two cases, depending on whether r = d or r < d. Suppose first that 
r = d {so the kernel has maximal dimension). Then by definition of Ag we have d 
"equations" in d unknowns, 

m[^Ki + ... + mfvd = 0{{SV)-'^''*' R) for all 1 < j < d, 

where mp'' = 0{MiV^') and the vectors {m[^\ . . . ,to|/^) € Ag are linearly inde- 
pendent as j varies. Using Cramer's rule we conclude that 

Vi = OdiiSVf-^^^'HSVy^^^'R) for all 1 < j < d 

since all the determinants and minors which arise from Cramer's rule are integers 
that vary from 1 to Od{V'^''^^^) in magnitude. Since Mi = 0(y) for all i, we 
conclude that x = OdiV'^''^'^°HSV)-^''^' R) for all x e P, which by construction 
of R (and the fact that s < B) shows that P c [-{SV)-^"^^ R, {SV)-^''^^ R] (if 
B is sufficiently large). Thus in this case we can take Psmaii = P and Pgparse = {0}. 

Now we consider the case when r < d (so the kernel is proper). In this case we 
can write Xg as a graph of some linear transformation T : IV R''"'": after 
permutation of the coordinates, we have 

Xg = {{x,Tx) e R'' X R'^-'^ : x e R'^}. 

The coefficients of T form anr x d—r matrix, which can be computed by Cramer's 
rule to be rational numbers with numerator and denominator Od{{SV)'^'^^^ ■*); this 
follows from Xg being spanned by Ag, and on the integrality and size bounds on 
the coefficients of elements of Ag . 
Let m As he arbitrary. Since As is also contained in Xs, we can write m = 
{m[i^r],Tm[i^r]) for some m[i^r] € Z'' with magnitude Od{{SV)^'^^^'^). By defini- 
tion of As , we conclude that 

where := {vi,... ,Vr), ^[r+i.d] ■= {vr+i,--- ,Vd), and the inner products on 

R*^ and R''"'^ are the standard ones. Thus 

(mr,w[i,r] +T*v^r+i.d])BJ' = OiiSVy^''^' K) 

where T* : R'^"'' R'^ be the adjoint linear transformation to T. Now since A 
spans X, we see that the m[i^r] will linearly span R'' as we vary over all elements 
m of A. Thus by Cramer's rule we conclude that 

+3^*%+l,<i] = Od(yO''(^')(5y)-^"^'i?). (18) 

Write $(w_1, \dots, w_r) := T^*v_{[r+1,d]}$, thus $w_1, \dots, w_r$ are rational numbers. We then construct the symmetric generalized arithmetic progressions $P_{\mathrm{small}}$ and $P_{\mathrm{sparse}}$ explicitly as

$$P_{\mathrm{sparse}} := \{m_1w_1 + \dots + m_rw_r + m_{r+1}v_{r+1} + \dots + m_dv_d : |m_i| \le M_i \text{ for all } 1 \le i \le d\}$$

and

$$P_{\mathrm{small}} := \{m_1(v_1 + w_1) + \dots + m_r(v_r + w_r) : |m_i| \le M_i \text{ for all } 1 \le i \le d\}.$$

It is clear from construction that $P \subset P_{\mathrm{sparse}} + P_{\mathrm{small}}$, and that $P_{\mathrm{sparse}}$ and $P_{\mathrm{small}}$ have rank at most $d$ and volume at most $V$. Now from (18) we have

v,+Wi = Od{{SVf^^^'\SV)-^^^'R) 

and hence for any x G Psmaii we have 

X = Od{{SVf-^'''\SV)-''"^" R). 

By choosing $B$ large enough we conclude

$$|x| \le R/S$$

which gives the desired smallness bound on $P_{\mathrm{small}}$.

The only remaining task is to show that $SP_{\mathrm{sparse}}$ is sparse. It suffices to show that $SP_{\mathrm{sparse}} - SP_{\mathrm{sparse}}$ has no non-zero intersection with $[-RS, RS]$. Suppose for contradiction that this failed. Then we can find $m_1, \dots, m_d$ with $|m_i| \le 2SM_i$ for all $i$ and

$$0 < m_1w_1 + \dots + m_rw_r + m_{r+1}v_{r+1} + \dots + m_dv_d \le RS.$$

Let Q be the least common denominator of all the coefficients of T*, then Q = 
Od{{SV)^'^^^'^). Multiplying the above equation by Q, we obtain 

< miQwi+. . .+mrQwr+mr+iQvr+i+. . .+mdQvd < 0{RSV°''^^">) < (SV)^"*' R. 

Since {wi, . . . , w,.) = T*V[j.+i^r+d]: the expression between the inequality signs is an 
integer linear combination of u^+i, ... ,Vd, with all coefficients of size Od{{SV)'^'^'^^"^), 
say 

miQwi -I- ... -I- rUrQwr + rrir+iQvr+i -|- . . . -|- rudQvd = Or+iUr+i + . . . + adVd- 
In particular we see that this expression hes in (SV)^ P (again taking B to be 
sufBciently large). Thus by construction of R, we can improve the upper bound of 
(5y)^"^'i?to [SVy'^^'R: 

< a^+iUr+i + . . . + adVd < {SV)-'^"^' R. (19) 

Taking B to be large, this implies that (0, . . . ,0, a^+i, . . . , ad) lies in Xg+i, which 
equals X^- But Xg was a graph from R*^ to R and thus a^+i = . . . = = 0, 
which contradicts (19). This establishes the sparseness. ■ 



9. Proof of Theorem 3.10 

Let $Y = \{y_1, \dots, y_l\}$ be a set of $l$ independent vectors in $\mathbf{R}^n$. Let us recall that $M^{\mu,Y}$ denotes the random matrix with row vectors $X_1^\mu, \dots, X_{n-l}^\mu, y_1, \dots, y_l$, where $X_i^\mu$ are i.i.d copies of $X^\mu = (\eta_1^\mu, \dots, \eta_n^\mu)$.

Define $\delta(\mu) := \max\{1 - \mu, \mu/2\}$. It is easy to show that for any subspace $V$ of dimension $d$

$$\mathbf{P}(X^\mu \in V) \le \delta(\mu)^{n-d}. \qquad (20)$$
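The bound (20) can be checked by exhaustive enumeration in small dimension. In the sketch below the subspace is described as the common kernel of a list of linearly independent integer vectors (an encoding chosen for this illustration, so that the codimension $n - d$ equals the number of vectors); the function names are hypothetical, and the enumeration over $\{-1, 0, 1\}^n$ is only feasible for small $n$.

```python
from itertools import product

def delta(mu):
    """delta(mu) = max(1 - mu, mu/2), the largest atom of eta^mu."""
    return max(1 - mu, mu / 2)

def prob_in_subspace(normals, mu):
    """P(X^mu lies in the common kernel of `normals`), by enumerating {-1,0,1}^n.

    `normals` must be linearly independent vectors of equal length n; the
    subspace they cut out has dimension n - len(normals).
    """
    n = len(normals[0])
    total = 0.0
    for x in product((-1, 0, 1), repeat=n):
        if all(sum(a * b for a, b in zip(row, x)) == 0 for row in normals):
            p = 1.0
            for coordinate in x:
                p *= (1 - mu) if coordinate == 0 else mu / 2
            total += p
    return total
```

For the two normals (1, 1, 0, 0) and (0, 0, 1, -1) in dimension 4 the codimension is 2, and for $\mu = 1$ the probability equals $1/4 = \delta(1)^2$, so the bound is attained in that case.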

In the following, we are going to use $N$ to denote the quantity $(1/\delta(\mu))^n$. As $0 < \mu \le 1$, $0 < \delta(\mu) < 1$ and thus $N$ is exponentially large in $n$. Thus it will suffice to show that

$$\mathbf{P}(M^{\mu,Y} \text{ singular}) \le N^{-\varepsilon+o(1)}$$

for some $\varepsilon = \varepsilon(\mu, l) > 0$, where the $o(1)$ term is allowed to depend on $\mu$, $l$, and $\varepsilon$. We may assume that $n$ is large depending on $\mu$ and $l$ since the claim is trivial otherwise.

Notice that if $M^{\mu,Y}$ is singular, then the row vectors span a proper subspace $V$. To prove the theorem, it suffices to show that for any sufficiently small positive constant $\varepsilon$

$$\sum_{V,\ V \text{ proper subspace}} \mathbf{P}(X_1^\mu, \dots, X_{n-l}^\mu, y_1, \dots, y_l \text{ span } V) \le N^{-\varepsilon+o(1)}.$$

Arguing as in [25, Lemma 5.1], we can restrict ourselves to hyperplanes. Thus, it is enough to prove

$$\sum_{V,\ V \text{ hyperplane}} \mathbf{P}(X_1^\mu, \dots, X_{n-l}^\mu, y_1, \dots, y_l \text{ span } V) \le N^{-\varepsilon+o(1)}.$$

Clearly, we may restrict our attention to those hyperplanes $V$ which are spanned by their intersection with $\{-1, 0, 1\}^n$, together with $y_1, \dots, y_l$. Let us call such hyperplanes non-trivial. Furthermore, we call a hyperplane $H$ degenerate if there is a vector $v$ orthogonal to $H$ such that at most $\log\log n$ coordinates of $v$ are non-zero.

Following [25, Lemma 5.3], it is easy to see that the number of degenerate non-trivial hyperplanes is at most $N^{o(1)}$. Thus, their contribution in the sum is at most

which is acceptable. Therefore, from now on we can assume that V is non- 
degenerate. 

For each non-trivial hyperplane $V$, define the discrete codimension $d(V)$ of $V$ to be the unique integer multiple of $1/n$ such that

$$N^{-\frac{d(V)}{n} - \frac{1}{n^2}} < \mathbf{P}(X^\mu \in V) \le N^{-\frac{d(V)}{n}}. \qquad (21)$$

Thus $d(V)$ is large when $V$ contains few elements from $\{-1, 0, 1\}^n$, and conversely.

Let $B_V$ denote the event that $X_1^\mu, \dots, X_{n-l}^\mu, y_1, \dots, y_l$ span $V$. We denote by $\Omega_d$ the set of all non-degenerate, non-trivial hyperplanes with discrete codimension $d$. It is simple to see that $1 \le d(V) \le n^2$ for all non-trivial $V$. In particular, there are $N^{o(1)}$ possible values of $d$, so to prove our theorem it suffices to show that

$$\sum_{V \in \Omega_d} \mathbf{P}(B_V) \le N^{-\varepsilon+o(1)} \qquad (22)$$

for all $1 \le d \le n^2$.

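We first handle the (simpler) case when $d$ is large. Note that if $X_1^\mu, \dots, X_{n-l}^\mu, y_1, \dots, y_l$ span $V$, then some subset of $n - l - 1$ vectors $X_i^\mu$ together with the $y_j$'s already span $V$ (since the $y_j$'s are independent). By symmetry, we have

$$\sum_{V \in \Omega_d} \mathbf{P}(B_V) \le (n - l)\sum_{V \in \Omega_d} \mathbf{P}(X_1^\mu, \dots, X_{n-l-1}^\mu, y_1, \dots, y_l \text{ span } V)\,\mathbf{P}(X_{n-l}^\mu \in V)$$

$$\le nN^{-\frac{d}{n}}\sum_{V \in \Omega_d} \mathbf{P}(X_1^\mu, \dots, X_{n-l-1}^\mu, y_1, \dots, y_l \text{ span } V).$$

The spanning events in the last sum are disjoint, so that sum is at most 1. This disposes of the case when $d \ge \varepsilon n$. It remains to verify the following lemma.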

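Lemma 9.1. For every sufficiently small positive constant $\varepsilon$, the following holds. If $d$ is any integer multiple of $1/n$ such that

$$1 \le d \le (\varepsilon - o(1))n \qquad (23)$$

then we have

$$\sum_{V \in \Omega_d} P(B_V) \le N^{-\varepsilon + o(1)}.$$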

Proof. For $0 < \mu \le 1$ we define the quantity $0 < \mu^* \le 1/8$ as follows. If $\mu = 1$ then $\mu^* := 1/16$. If $1/2 \le \mu < 1$, then $\mu^* := (1-\mu)/4$. If $0 < \mu < 1/2$, then $\mu^* := \mu/4$. We will need the following inequality, which is a generalization of [25, Lemma 6.2].

Lemma 9.2. Let $V$ be a non-degenerate non-trivial hyperplane. Then we have

$$P(X^{\mu} \in V) \le (1/2 + o(1))\, P(X^{\mu^*} \in V).$$

The proof of Lemma 9.2 relies on some Fourier-analytic ideas of Halász [9] (see 
also [10], [25], [26]) and is deferred till the end of the section. Assuming it for now, 
we continue the proof of Lemma 9.1. 

Let us set $\gamma := 1/2$; this is not the optimal value of this parameter, but will suffice for this argument.

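Let $A_V$ be the event that $X_1^{\mu^*}, \ldots, X_{(1-\gamma)n}^{\mu^*}, X'^{\mu}_1, \ldots, X'^{\mu}_{(\gamma-\varepsilon)n}$ are linearly independent in $V$, where the $X_i^{\mu^*}$'s are i.i.d. copies of $X^{\mu^*}$ and the $X'^{\mu}_j$'s are i.i.d. copies of $X^{\mu}$.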

Lemma 9.3. Let $V \in \Omega_d$. Then

$$P(A_V) \ge N^{(1-\gamma) - (1-\varepsilon)d + o(1)}.$$

Proof. Notice that the right-hand side of the bound in Lemma 9.3 is the probability of the event $A'_V$ that $X_1^{\mu^*}, \ldots, X_{(1-\gamma)n}^{\mu^*}, X'^{\mu}_1, \ldots, X'^{\mu}_{(\gamma-\varepsilon)n}$ belong to $V$. Thus, by Bayes' identity it is sufficient to show that

$$P(A_V \mid A'_V) = N^{o(1)}.$$

From (21) we have

$$P(X^{\mu} \in V) = (1 + O(1/n))\,\delta(\mu)^{d} \qquad (24)$$

and hence by Lemma 9.2

$$P(X^{\mu^*} \in V) \ge (2 + O(1/n))\,\delta(\mu)^{d}. \qquad (25)$$

On the other hand, by (20)

$$P(X^{\mu^*} \in W) \le (1 - \mu^*)^{n - \dim(W)}$$

for any subspace $W$. By Bayes' identity we thus have the conditional probability bound

$$P(X^{\mu^*} \in W \mid X^{\mu^*} \in V) \le (2 + O(1/n))^{-1}\,\delta(\mu)^{-d}\,(1-\mu^*)^{n-\dim(W)} \le \delta(\mu)^{-d}\,(1-\mu^*)^{n-\dim(W)}.$$

When $\dim(W) \le (1-\gamma)n$ the bound is less than one when $\varepsilon$ is sufficiently small, thanks to the bound on $d$ and the choice $\gamma = 1/2$.

Let $E_k$ be the event that $X_1^{\mu^*}, \ldots, X_k^{\mu^*}$ are linearly independent. The above estimates imply that

$$P(E_{k+1} \mid E_k \wedge A'_V) \ge 1 - \delta(\mu)^{-d}\,(1-\mu^*)^{n-k}$$

for all $0 \le k \le (1-\gamma)n$. Applying Bayes' identity repeatedly we thus obtain 

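To complete the proof, observe that

$$P(X^{\mu} \in W) \le \delta(\mu)^{n - \dim(W)}$$

for any subspace $W$, and hence by (24)

$$P(X^{\mu} \in W \mid X^{\mu} \in V) \le (1 + O(1/n))\,\delta(\mu)^{-d}\,\delta(\mu)^{n-\dim(W)}.$$

Let us assume $E_{(1-\gamma)n}$ and denote by $W$ the $(1-\gamma)n$-dimensional subspace spanned by $X_1^{\mu^*}, \ldots, X_{(1-\gamma)n}^{\mu^*}$. Let $U_k$ denote the event that $X'^{\mu}_1, \ldots, X'^{\mu}_k, W$ are linearly independent. We have 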



for all $0 \le k < (\gamma - \varepsilon)n$, thanks to (23). Thus by Bayes' identity we obtain 



0<fe<(7-£)n 

as desired. 



Now we continue the proof of the theorem. Fix $V \in \Omega_d$. Since $A_V$ and $B_V$ are independent, we have, by Lemma 9.3, that

$$P(B_V) = \frac{P(A_V \wedge B_V)}{P(A_V)} \le N^{-(1-\gamma) + (1-\varepsilon)d + o(1)}\, P(A_V \wedge B_V).$$

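Consider a set

$$X_1^{\mu^*}, \ldots, X_{(1-\gamma)n}^{\mu^*},\ X'^{\mu}_1, \ldots, X'^{\mu}_{(\gamma-\varepsilon)n},\ X_1^{\mu}, \ldots, X_{n-l}^{\mu}$$

of vectors satisfying $A_V \wedge B_V$. Then there exist $\varepsilon n - l - 1$ vectors $X_{j_1}^{\mu}, \ldots, X_{j_{\varepsilon n - l - 1}}^{\mu}$ inside $X_1^{\mu}, \ldots, X_{n-l}^{\mu}$ which, together with

$$X_1^{\mu^*}, \ldots, X_{(1-\gamma)n}^{\mu^*},\ X'^{\mu}_1, \ldots, X'^{\mu}_{(\gamma-\varepsilon)n},\ y_1, \ldots, y_l,$$

span $V$. Since the number of possible indices $j_1, \ldots, j_{\varepsilon n - l - 1}$ is $\binom{n-l}{\varepsilon n - l - 1} = 2^{(h(\varepsilon) + o(1))n}$ (with $h$ being the entropy function), by conceding a factor of

$$2^{(h(\varepsilon) + o(1))n} = N^{\alpha h(\varepsilon) + o(1)},$$

where $\alpha := \log_{1/\delta(\mu)} 2$, we can assume that $j_i = i$ for all relevant $i$. Let $C_V$ be the event that

$$X_1^{\mu^*}, \ldots, X_{(1-\gamma)n}^{\mu^*},\ X'^{\mu}_1, \ldots, X'^{\mu}_{(\gamma-\varepsilon)n},\ X_1^{\mu}, \ldots, X_{\varepsilon n - l - 1}^{\mu},\ y_1, \ldots, y_l \text{ span } V.$$

Then we have

$$P(B_V) \le N^{-(1-\gamma) + (1-\varepsilon)d + \alpha h(\varepsilon) + o(1)}\, P\big(C_V \wedge (X_{\varepsilon n}^{\mu}, \ldots, X_{n-l}^{\mu} \text{ in } V)\big).$$

On the other hand, $C_V$ and the event $(X_{\varepsilon n}^{\mu}, \ldots, X_{n-l}^{\mu} \text{ in } V)$ are independent, so

$$P\big(C_V \wedge (X_{\varepsilon n}^{\mu}, \ldots, X_{n-l}^{\mu} \text{ in } V)\big) = P(C_V)\, P(X^{\mu} \in V)^{(1-\varepsilon)n + 1 - l}.$$

Putting the last two estimates together we obtain

$$P(B_V) \le N^{-(1-\gamma) + (1-\varepsilon)d + \alpha h(\varepsilon) + o(1)}\, N^{-((1-\varepsilon)n + 1 - l)d/n}\, P(C_V).$$

Since any set of vectors can only span a single space $V$, we have $\sum_{V \in \Omega_d} P(C_V) \le 1$. Thus, by summing over $\Omega_d$, we have

$$\sum_{V \in \Omega_d} P(B_V) \le N^{-(1-\gamma) + (1-\varepsilon)d + \alpha h(\varepsilon) + o(1)}\, N^{-((1-\varepsilon)n + 1 - l)d/n}.$$

With the choice $\gamma = 1/2$, we obtain a bound of $N^{-\varepsilon + o(1)}$ as desired, by choosing $\varepsilon$ sufficiently small. This provides the desired bound in Lemma 9.1. ■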
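
The counting step in the proof above rests on the standard estimate that a binomial coefficient $\binom{n}{\varepsilon n}$ equals $2^{(h(\varepsilon)+o(1))n}$, where $h$ is the binary entropy function. The following Python sketch checks this numerically for the simplified coefficient $\binom{n}{\varepsilon n}$ rather than the exact index count used in the proof; the function names are illustrative assumptions.

```python
import math


def binary_entropy(eps: float) -> float:
    """Binary entropy h(eps) = -eps*log2(eps) - (1-eps)*log2(1-eps)."""
    if eps <= 0.0 or eps >= 1.0:
        return 0.0
    return -eps * math.log2(eps) - (1.0 - eps) * math.log2(1.0 - eps)


def log2_binomial_rate(n: int, k: int) -> float:
    """Return (1/n) * log2 of the binomial coefficient C(n, k)."""
    return math.log2(math.comb(n, k)) / n


if __name__ == "__main__":
    eps = 0.1
    for n in (100, 1000, 10000):
        k = round(eps * n)
        # The rate approaches h(eps) from below as n grows.
        print(n, log2_binomial_rate(n, k), binary_entropy(eps))
```

The gap between the two printed columns shrinks as $n$ grows, which is what the $o(1)$ term in the exponent records.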



9.4. Proof of Lemma 9.2. To conclude, we prove Lemma 9.2. Let $v = (a_1, \ldots, a_n)$ be the normal vector of $V$ and define

$$F_{\mu}(\xi) := \prod_{i=1}^{n} \big((1-\mu) + \mu \cos 2\pi a_i \xi\big).$$

From Fourier analysis we have (cf. [25])

$$P(X^{\mu} \in V) = P(X^{\mu} \cdot v = 0) = \int_0^1 F_{\mu}(\xi)\, d\xi.$$
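
The Fourier identity above can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes a short integer vector $v$, computes $P(X^{\mu} \cdot v = 0)$ by brute-force enumeration, and compares it with the average of $F_{\mu}$ over an equally spaced grid on $[0,1)$. Because $F_{\mu}$ is a trigonometric polynomial of degree at most $\sum_i |a_i|$, a grid with more points than that degree reproduces the integral exactly, up to rounding. All function names are hypothetical.

```python
import itertools
import math


def exact_prob_zero(v, mu):
    """P(X . v = 0) where each coordinate of X is 0 w.p. 1-mu, +1/-1 w.p. mu/2."""
    total = 0.0
    for signs in itertools.product((-1, 0, 1), repeat=len(v)):
        if sum(s * a for s, a in zip(signs, v)) == 0:
            weight = 1.0
            for s in signs:
                weight *= (1.0 - mu) if s == 0 else mu / 2.0
            total += weight
    return total


def f_mu(v, mu, xi):
    """F_mu(xi) = prod_i ((1 - mu) + mu * cos(2*pi*a_i*xi))."""
    prod = 1.0
    for a in v:
        prod *= (1.0 - mu) + mu * math.cos(2.0 * math.pi * a * xi)
    return prod


def fourier_prob_zero(v, mu):
    """Average of F_mu over a grid fine enough to integrate it exactly."""
    m = sum(abs(a) for a in v) + 1  # more grid points than the degree
    return sum(f_mu(v, mu, j / m) for j in range(m)) / m


if __name__ == "__main__":
    v, mu = (1, 2, 3), 0.5
    print(exact_prob_zero(v, mu), fourier_prob_zero(v, mu))
```

Both numbers agree to machine precision in this toy case.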

The proof of Lemma 9.2 is based on the following technical lemma. 

Lemma 9.5. Let $\mu_1$ and $\mu_2$ be positive numbers at most 1/2 such that the following two properties hold for any $\xi, \xi' \in [0,1]$:

$$F_{\mu_1}(\xi) \le F_{\mu_2}(\xi)^4 \qquad (26)$$

and

$$F_{\mu_1}(\xi)\, F_{\mu_1}(\xi') \le F_{\mu_2}(\xi + \xi')^2. \qquad (27)$$

Furthermore,

$$\int_0^1 F_{\mu_2}(\xi)\, d\xi = o(1). \qquad (28)$$

Then

$$\int_0^1 F_{\mu_1}(\xi)\, d\xi \le (1/2 + o(1)) \int_0^1 F_{\mu_2}(\xi)\, d\xi. \qquad (29)$$

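Proof. Notice that since $\mu_1, \mu_2 \le 1/2$, $F_{\mu_1}(\xi)$ and $F_{\mu_2}(\xi)$ are positive for any $\xi$. From (27) we have the sumset inclusion

$$\{\xi \in [0,1] : F_{\mu_1}(\xi) > \alpha\} + \{\xi \in [0,1] : F_{\mu_1}(\xi) > \alpha\} \subseteq \{\xi \in [0,1] : F_{\mu_2}(\xi) > \alpha\}$$

for any $\alpha > 0$. Taking measures of both sides and applying the Mann-Kneser-Macbeath "$\alpha + \beta$ inequality" $|A + B| \ge \min(|A| + |B|, 1)$ (see [17]), we obtain

$$\min\big(2\,|\{\xi \in [0,1] : F_{\mu_1}(\xi) > \alpha\}|,\ 1\big) \le |\{\xi \in [0,1] : F_{\mu_2}(\xi) > \alpha\}|.$$

But from (28) we see that $|\{\xi \in [0,1] : F_{\mu_2}(\xi) > \alpha\}|$ is strictly less than 1 if $\alpha > o(1)$. Thus we conclude that

$$|\{\xi \in [0,1] : F_{\mu_1}(\xi) > \alpha\}| \le \tfrac{1}{2}\,|\{\xi \in [0,1] : F_{\mu_2}(\xi) > \alpha\}|$$

when $\alpha > o(1)$. Integrating this in $\alpha$, we obtain

$$\int_{\xi \in [0,1] :\, F_{\mu_1}(\xi) > o(1)} F_{\mu_1}(\xi)\, d\xi \le \frac{1}{2} \int_0^1 F_{\mu_2}(\xi)\, d\xi.$$

On the other hand, from (26) we see that when $F_{\mu_1}(\xi) \le o(1)$, then $F_{\mu_1}(\xi) = o(F_{\mu_1}(\xi)^{1/4}) = o(F_{\mu_2}(\xi))$, and thus

$$\int_{\xi \in [0,1] :\, F_{\mu_1}(\xi) \le o(1)} F_{\mu_1}(\xi)\, d\xi \le o(1) \int_0^1 F_{\mu_2}(\xi)\, d\xi.$$

Adding these two inequalities we obtain (29) as desired. ■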
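
The step "integrating this in $\alpha$" in the proof above can be spelled out with the layer-cake formula. The following derivation is a sketch under the assumption that the level-set inequality holds for all $\alpha \ge \beta$, where $\beta$ denotes the $o(1)$ threshold; the symbol $\beta$ is introduced here only for illustration.

```latex
% Layer-cake formula for a non-negative function F on [0,1] and a threshold beta > 0:
\int_{\{F > \beta\}} F(\xi)\, d\xi
  = \beta\, |\{F > \beta\}| + \int_{\beta}^{\infty} |\{F > \alpha\}|\, d\alpha .

% Apply this to F_{\mu_1} and use |\{F_{\mu_1} > \alpha\}| \le \tfrac12 |\{F_{\mu_2} > \alpha\}| for \alpha \ge \beta:
\int_{\{F_{\mu_1} > \beta\}} F_{\mu_1}(\xi)\, d\xi
  \le \frac12 \Big( \beta\, |\{F_{\mu_2} > \beta\}|
        + \int_{\beta}^{\infty} |\{F_{\mu_2} > \alpha\}|\, d\alpha \Big)
  = \frac12 \int_{\{F_{\mu_2} > \beta\}} F_{\mu_2}(\xi)\, d\xi
  \le \frac12 \int_0^1 F_{\mu_2}(\xi)\, d\xi .
```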
By Lemma 5.1,

$$P(X^{\mu} \cdot v = 0) \le P_{\mu}(v) \le P_{\mu/4}(v) = \int_0^1 F_{\mu/4}(\xi)\, d\xi.$$

It suffices to show that the conditions of Lemma 9.5 hold with $\mu_1 = \mu/4$ and $\mu_2 = \mu_1/4 = \mu/16$. The last estimate $\int_0^1 F_{\mu_2}(\xi)\, d\xi \le o(1)$ is a simple corollary of the fact that at least $\log\log n$ among the $a_i$ are non-zero (instead of $\log\log n$, one can use any function tending to infinity with $n$), so we only need to verify the other two. Inequality (26) follows from the fact that $\mu_2 = \mu_1/4$ and the proof of the fourth property of Lemma 5.1.

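To verify (27), it suffices to show that for any $\mu' \le 1/2$ and any $\theta, \theta'$

$$\big((1-\mu') + \mu' \cos\theta\big)\big((1-\mu') + \mu' \cos\theta'\big) \le \Big(\big(1 - \tfrac{\mu'}{4}\big) + \tfrac{\mu'}{4} \cos(\theta + \theta')\Big)^2.$$

The left-hand side is bounded from above by $\big((1-\mu') + \mu' \cos\tfrac{\theta+\theta'}{2}\big)^2$, due to convexity. Thus, it remains to show that

$$(1-\mu') + \mu' \cos\tfrac{\theta+\theta'}{2} \le \big(1 - \tfrac{\mu'}{4}\big) + \tfrac{\mu'}{4} \cos(\theta + \theta')$$

since both expressions are positive for $\mu' \le 1/2$. By defining $x := \cos\tfrac{\theta+\theta'}{2}$, so that $\cos(\theta+\theta') = 2x^2 - 1$, the last inequality becomes

$$(1-\mu') + \mu' x \le \big(1 - \tfrac{\mu'}{4}\big) + \tfrac{\mu'}{4}(2x^2 - 1),$$

which rearranges to $\tfrac{\mu'}{2}(1-x)^2 \ge 0$ and therefore holds. This completes the proof of Lemma 9.2.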
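
The two pointwise inequalities behind (26) and (27) can be sanity-checked numerically for a single factor of the product. The sketch below is illustrative only: it assumes that (26) is used in the form $F_{\mu_1} \le F_{\mu_2}^4$ with $\mu_2 = \mu_1/4$, and it tests the factor-level inequalities on a grid of angles, which is a check and not a proof. The function names and the grid size are assumptions.

```python
import math


def factor(mu: float, theta: float) -> float:
    """One factor (1 - mu) + mu*cos(theta) of the product F_mu."""
    return (1.0 - mu) + mu * math.cos(theta)


def check_pointwise(mu: float, steps: int = 60, tol: float = 1e-12) -> bool:
    """Check both factor-level inequalities on a grid, for 0 < mu <= 1/2.

    (26)-type: factor(mu, t) <= factor(mu/4, t)**4
    (27)-type: factor(mu, t) * factor(mu, s) <= factor(mu/4, t + s)**2
    """
    angles = [2.0 * math.pi * i / steps for i in range(steps + 1)]
    for t in angles:
        if factor(mu, t) > factor(mu / 4.0, t) ** 4 + tol:
            return False
        for s in angles:
            lhs = factor(mu, t) * factor(mu, s)
            rhs = factor(mu / 4.0, t + s) ** 2
            if lhs > rhs + tol:
                return False
    return True


if __name__ == "__main__":
    for mu in (0.05, 0.25, 0.5):
        print(mu, check_pointwise(mu))
```

Since every factor is non-negative when $\mu \le 1/2$, the factor-level inequalities multiply up to the corresponding inequalities for $F_{\mu_1}$ and $F_{\mu_2}$.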